<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>ELiRF-UPV at MentalRiskES 2025: Spanish Longformer for Early Detection of Gambling Addiction Risk</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreu Casamayor</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vicent Ahuir</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Molina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lluís-Felip Hurtado</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València</institution>
          ,
          <addr-line>Camino de Vera s/n, 46022 Valencia.</addr-line>
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper describes the approaches the ELiRF-VRAIN team took in the shared tasks of MentalRiskES at IberLEF 2025. The tasks focus on the detection of mental health disorders in Spanish-language social media, specifically: Risk Detection of Gambling Disorders and Type of Addiction Detection. We have developed three approaches: one based on Support Vector Machines, and two based on Transformer architectures, RoBERTa and Longformer. For the Transformer models, we continued pre-training the base models to adapt them to the mental health domain, resulting in two models specifically tailored for this area. During the fine-tuning phase, we applied a data augmentation process using the data provided by the organizing entity. According to the results obtained, our approaches align well with the objectives of the tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Longformer</kwd>
        <kwd>Transformers</kwd>
        <kwd>Support Vector Machine</kwd>
        <kwd>Mental disorder detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Mental health conditions, including depression, anxiety, and schizophrenia, have become critical global
issues, affecting millions of people worldwide. According to the World Health Organization (WHO),
mental disorders are characterized by clinically significant disruptions in an individual’s thinking,
emotional control, or behavior [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Alarmingly, nearly one in eight individuals worldwide has a mental illness, with a significant
proportion of cases remaining undiagnosed and untreated [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Despite growing awareness, the prevalence of
mental health disorders continues to rise, and stigma and discrimination toward affected individuals
persist. Governments are investing in prevention and treatment initiatives, but shortages of human and
material resources limit access to adequate care for many. Furthermore, early detection of mental
disorders remains a significant challenge.
      </p>
      <p>Early detection of mental illnesses is therefore essential to improving individuals’ lives and reducing
their impact on society. Significant progress has been made in automatic detection through the analysis
of social media text. However, numerous challenges still hinder this task, including data quality, quantity,
and availability. The goal of the MentalRiskES shared tasks is to provide high-quality labeled data in
Spanish and to encourage the development of models for the early detection of mental health disorders.</p>
      <p>
        In the 2025 edition [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], the competition consisted of two tasks: (1) Risk Detection of Gambling
Disorders and (2) Type of Addiction Detection. Our team participated in both tasks.
      </p>
      <p>
        We considered three different approaches.
1. The first approach is based on a classical machine learning algorithm: Support Vector Machines
(SVM). SVMs have shown reliable performance in long-text classification tasks, making them
suitable for this scenario. This approach serves as a baseline to evaluate the effectiveness of
traditional models.
2. The second approach leverages a Transformer-based model [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Specifically, we employ a
pretrained RoBERTa model [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] as the foundation and perform fine-tuning to adapt it to the
task-specific domain. For the fine-tuning process, we consider two distinct datasets: the official
dataset provided by the organizers and an expanded version obtained through data augmentation
techniques.
3. The final approach follows a similar strategy to the second one; however, to capture a broader
context, we utilize a pre-trained Longformer model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The model can leverage more contextual
information thanks to its ability to process longer input sequences. For the fine-tuning phase, we
use the same datasets as in the previous approach.
      </p>
      <p>For Tasks 1 and 2, we submitted three runs, one for each approach described above. The
best-performing model for each run was selected through a preliminary evaluation phase, where we tested
multiple model configurations and datasets.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Description of Dataset and Tasks</title>
      <p>
        The datasets provided by the organizers consisted of a collection of messages posted to various public
Telegram groups and Twitch channels [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These groups are characterized by their focus on gambling, and
all communication is in Spanish. User labeling was performed through manual and semi-automatic
annotation, where users were classified based on the presence or absence of signs of problematic
behavior related to gambling addiction.
      </p>
      <p>This year, the focus is on gambling addiction. All users included in the dataset exhibit signs or
symptoms of this disorder. The dataset is distributed as follows: 7 users for the trial set, 350 for training,
and 160 for testing. The same dataset is used for both tasks; only the label assigned to each
user varies, while the content remains the same.</p>
      <p>The main objective of this competition is to predict mental disorders as early as possible. To simulate
a realistic scenario, the organizers implemented a server-based setup that delivers data in packets, each
containing one message per user. The system must state a prediction for each user, based on the current
and previously received messages, before the next packet arrives. The ultimate goal is to identify the
presence of a mental disorder, if any, as early as possible in the message stream.</p>
      <sec id="sec-2-1">
        <title>2.1. Task 1: Risk Detection of Gambling Disorders</title>
        <p>Task 1 is a binary classification task aimed at predicting whether users are at high or low risk of
developing gambling addiction.</p>
        <p>Table 1 shows the distribution of the different labels in the dataset for the first task.</p>
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption><p>Label distribution for Task 1 (Train partition).</p></caption>
          <table>
            <thead>
              <tr><th>Partition</th><th>Low risk</th><th>High risk</th><th>Total</th></tr>
            </thead>
            <tbody>
              <tr><td>Train</td><td>178</td><td>172</td><td>350</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>To maximize the number of available samples for training, we combined the Train and Trial partitions.
The Total column in Table 1 displays the final distribution of samples in our training dataset.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Task 2: Type of Addiction Detection</title>
        <p>Task 2 is similar to Task 1; however, for all users, regardless of whether they are at low or high risk,
the objective is to predict the specific type of addiction they exhibit. This task is therefore framed as a
multi-class classification problem, with the following categories: Betting, Online Gaming, Trading and
Crypto, and Loot Boxes. Every user exhibits one addiction, whether this addiction is in a low-risk or
high-risk state.</p>
        <p>The label distribution in this total dataset can be seen in Table 2.</p>
        <sec id="sec-2-2-1">
          <title>Betting</title>
        </sec>
        <sec id="sec-2-2-2">
          <title>Online Gaming</title>
        </sec>
        <sec id="sec-2-2-3">
          <title>Trading and Crypto</title>
        </sec>
        <sec id="sec-2-2-4">
          <title>Loot Boxes</title>
        </sec>
        <sec id="sec-2-2-5">
          <title>Total</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. System Architecture and Techniques</title>
      <p>For this competition, we aimed to investigate three essential factors for this type of task: the size of the
context, domain-specific pretraining on base models, and task-specific fine-tuning approaches.</p>
      <p>The first factor is the amount of context required for accurate detection. Since each user may generate
a large number of messages, the input size becomes a critical consideration. One of our goals was
to investigate the impact of contextual information on system performance, that is, to assess how
well different models perform depending on the amount of context they can handle. To this end, we
evaluated three distinct systems: one based on classical machine learning approaches, another using
a RoBERTa model, and a third employing a Longformer model. Each of these systems can process
different input lengths, allowing us to analyze the effect of context size on prediction accuracy.
1. Classical machine learning approaches have no limit on the input size.
2. The selected RoBERTa model has a limit of 512 tokens in the input.
3. The selected Longformer model has a limit of 4096 tokens in the input.</p>
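      <p>As an illustration of these limits, the following sketch (checkpoint names assumed from Section 3.2; the message history is a placeholder) counts how many tokens of a user's concatenated history each Transformer would actually see:</p>
      <preformat>
# Sketch: how much of a user's concatenated history fits in each model's window.
# Checkpoint names are assumptions taken from Section 3.2; "messages" is a placeholder.
from transformers import AutoTokenizer

messages = ["primer mensaje del usuario", "segundo mensaje", "tercer mensaje"]
history = " ".join(messages)

for name, limit in [
    ("PlanTL-GOB-ES/roberta-large-bne", 512),
    ("PlanTL-GOB-ES/longformer-base-4096-bne-es", 4096),
]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    n_tokens = len(tokenizer(history)["input_ids"])
    print(f"{name}: {n_tokens} tokens, truncated: {n_tokens > limit}")
      </preformat>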
      <p>The second factor our team wanted to investigate was the impact of domain-specific pre-training on
base models and how it affects their ability to generate domain-relevant embeddings. This investigation
was carried out by comparing two types of models:
1. The first group includes base models such as BERT and RoBERTa, which were pre-trained on
large, general-domain corpora.
2. The second group consists of models built upon these base architectures, but further pre-trained
on domain-specific data related to mental health.</p>
      <p>The third factor is the impact of task-specific fine-tuning compared to more general, classic
fine-tuning approaches. For this purpose, we created two different datasets to train and evaluate the
performance of the Transformer-based systems.</p>
      <p>1. Dataset 1. We created a single sample per user by accumulating all their messages, for both high-risk
and low-risk labeled users.
2. Dataset 2. If we had prior knowledge of the point at which a user begins to exhibit symptoms
indicating a risk of mental illness, we could label all earlier messages as low-risk, and the message
containing the onset of symptoms and all subsequent messages as high-risk. This approach would
allow us to increase the number of high-risk samples, potentially leading to a more accurate
model. The data augmentation process used to implement this idea is described below (a construction
sketch follows this list). Using this technique, we obtain a dataset for each input length, resulting
in two datasets in total (512 and 4096 tokens).</p>
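      <p>A minimal sketch of the two constructions, under assumed field names (each user record carries a list of messages and a label; the critical-message index is the one obtained in Section 3.3):</p>
      <preformat>
# Sketch of the two training-set constructions; field names are assumptions.
def build_dataset1(users):
    # One sample per user: the full concatenated message history with the user label.
    return [(" ".join(u["messages"]), u["label"]) for u in users]

def build_dataset2(users, critical_index, tokenizer, max_tokens):
    # Every prefix of a positive user's history becomes a sample: prefixes that do
    # not yet contain the critical message are low-risk, the rest high-risk.
    samples = []
    for u in users:
        k = critical_index[u["id"]]  # 0-based index of the first symptomatic message
        for i in range(1, len(u["messages"]) + 1):
            text = " ".join(u["messages"][:i])
            enc = tokenizer(text, truncation=True, max_length=max_tokens)
            label = "high-risk" if i > k else "low-risk"
            samples.append((tokenizer.decode(enc["input_ids"], skip_special_tokens=True), label))
    return samples
      </preformat>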
      <p>To carry out our experimentation, we divided the original dataset into training (80% of users) and
evaluation (20% of users), maintaining the proportions of positive (high-risk users) and negative
(low-risk users) samples in each partition. Table 3 shows the distribution of samples in Dataset 1 for Task
1.</p>
      <p>Following the same procedure as in Task 1, we divided the corpus into two partitions: training
(80%) and evaluation (20%), while preserving the class distribution in each partition. Table 4 shows the
distribution of classes across both partitions.</p>
      <sec id="sec-3-1">
        <title>Betting</title>
      </sec>
      <sec id="sec-3-2">
        <title>Online Gaming</title>
      </sec>
      <sec id="sec-3-3">
        <title>Trading and Crypto</title>
      </sec>
      <sec id="sec-3-4">
        <title>Loot Boxes</title>
      </sec>
      <sec id="sec-3-5">
        <title>Total</title>
        <sec id="sec-3-5-1">
          <title>3.1. Classical Machine Learning Classifier Approach</title>
          <p>To evaluate the importance of context, we used a classical machine learning classifier capable of
processing the full input context. One of the main limitations of Transformer-based models is their
difficulty in handling long texts due to restrictions on input size. This constraint can negatively impact
the classification performance, as the input may not capture the entire sample, potentially losing
valuable information.</p>
          <p>
            First, we compared several classical machine learning classifiers. For this purpose, we used the
Scikit-learn library [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ], which provides a wide range of tools to support our experimentation. All classifiers
were used with their default parameters to ensure a fair comparison, and no data preprocessing or
prior analysis was applied. For feature extraction, we employed the TF-IDF method from the
Scikit-learn library, which generates a vector of the vocabulary size. We used the same configuration
selected in the previous year's competition [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ], which achieved the best
performance in the experiments conducted: character n-grams within word boundaries ("char_wb")
with an n-gram range of 4 to 5.
          </p>
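          <p>With Scikit-learn, this feature extractor can be written as follows (a sketch; user_documents stands for the concatenated per-user texts):</p>
          <preformat>
# TF-IDF over character n-grams restricted to word boundaries, n-gram range 4-5.
from sklearn.feature_extraction.text import TfidfVectorizer

user_documents = ["todas las publicaciones de un usuario concatenadas", "..."]
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(4, 5))
X_tfidf = vectorizer.fit_transform(user_documents)  # one row per user
          </preformat>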
          <table-wrap id="tab5">
            <label>Table 5</label>
            <caption><p>Performance of the classical classifiers on the evaluation partition.</p></caption>
            <table>
              <thead>
                <tr><th>Classifier</th><th>Macro-P</th><th>Macro-R</th><th>Macro-F1</th></tr>
              </thead>
              <tbody>
                <tr><td>Linear SVM</td><td>0.69</td><td>0.68</td><td>0.68</td></tr>
                <tr><td>Gradient Boosting</td><td>0.60</td><td>0.60</td><td>0.60</td></tr>
                <tr><td>K-Neighbors</td><td>0.65</td><td>0.65</td><td>0.65</td></tr>
                <tr><td>Random Forest</td><td>0.62</td><td>0.58</td><td>0.59</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>
            Table 5 shows the performance of the four classical approaches evaluated. The best-performing
classifier was the Linear SVM in both Precision and Recall, which is also reflected in a higher F1-score than
the rest of the approaches. Since the training samples were the same for both tasks, we assumed that
the SVM approach would also obtain the best performance for Task 2; therefore, we chose this approach for
both tasks. In addition to selecting the classification algorithm, we explored different preprocessing
methods and incorporated additional information for each message:
• Preprocessing of the data:
1. TweetTokenizer and stop-word removal: The text is tokenized using the TweetTokenizer,
followed by stop-word removal.
2. TweetTokenizer, cleaning, and lemmatization: This builds on the first approach by including
additional preprocessing steps, such as cleaning the text, removing non-alphanumeric
characters, and applying token lemmatization.
• Sentiment analysis: We employed the Transformer-based model
"lxyuan/distilbert-base-multilingual-cased-sentiments-student" [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] to perform sentiment analysis on each message
per user. The model outputs three sentiment categories: positive, negative, and neutral. These
outputs were normalized and incorporated as additional features alongside the TF-IDF
representation.
          </p>
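          <p>A sketch of how the sentiment scores can be appended to the TF-IDF matrix from the previous sketch; the glue code and variable names are ours, not the competition's:</p>
          <preformat>
# Append normalized positive/neutral/negative scores to the TF-IDF features.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from transformers import pipeline

sentiment = pipeline(
    "text-classification",
    model="lxyuan/distilbert-base-multilingual-cased-sentiments-student",
    top_k=None,  # return the score of every sentiment label
)

def sentiment_features(docs):
    feats = []
    for scores in sentiment(docs, truncation=True):
        by_label = {s["label"]: s["score"] for s in scores}
        feats.append([by_label.get(l, 0.0) for l in ("positive", "neutral", "negative")])
    return np.asarray(feats)

X_full = hstack([X_tfidf, csr_matrix(sentiment_features(user_documents))])
          </preformat>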
          <p>To identify the optimal parameters for each model, we conducted an exhaustive search using
GridSearch, a tool provided by Scikit-learn. The search was performed over the parameters C, tol, and
loss, with the following candidate values:
• C: [0.1, 1, 10, 100]
• tol: [0.1, 0.01, 0.001, 0.0001]
• loss: [hinge, squared_hinge]</p>
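          <p>The search can be expressed with Scikit-learn's GridSearchCV; the scoring metric and number of folds below are assumptions, as the paper does not state them:</p>
          <preformat>
# Exhaustive grid search over the LinearSVC hyperparameters listed above.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

param_grid = {
    "C": [0.1, 1, 10, 100],
    "tol": [0.1, 0.01, 0.001, 0.0001],
    "loss": ["hinge", "squared_hinge"],
}
search = GridSearchCV(LinearSVC(), param_grid, scoring="f1_macro", cv=5)
search.fit(X_full, y_train)  # features and gold labels from the previous steps (assumed)
print(search.best_params_)
          </preformat>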
      </sec>
      <sec id="sec-3-13">
        <title>Task 1</title>
        <p>In total, we obtained four different configurations for experimentation. Table 6 presents these
configurations, with the optimal parameters for each one listed in the Best Parameters column.</p>
        <p>Table 7 shows the results on the evaluation partition. The best configuration was SVM-t1-4,
corresponding to a Linear SVM combined with sentiment analysis and the second preprocessing approach
that includes thorough data cleaning and preprocessing.</p>
        <table-wrap id="tab7">
          <label>Table 7</label>
          <caption><p>Results of the SVM configurations for Task 1 on the evaluation partition.</p></caption>
          <table>
            <thead>
              <tr><th>Configuration</th><th>Macro-P</th><th>Macro-R</th></tr>
            </thead>
            <tbody>
              <tr><td>SVM-t1-1</td><td>0.68</td><td>0.68</td></tr>
              <tr><td>SVM-t1-2</td><td>0.68</td><td>0.68</td></tr>
              <tr><td>SVM-t1-3</td><td>0.70</td><td>0.70</td></tr>
              <tr><td>SVM-t1-4</td><td>0.71</td><td>0.71</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-14">
        <title>Task 2</title>
        <p>For Task 2, we performed the same hyperparameter optimization as in Task 1. The best values of the
hyperparameters for each combination are shown in Table 8.</p>
        <p>Table 9 shows the results obtained by each combination for Task 2 on the evaluation partition. All
combinations achieved a perfect score (F1-score of 1). Given that all systems achieved a perfect score,
we took the preprocessing cost into account. In this regard, we consider
SVM-t2-1 the most favorable system due to its lower computational and preprocessing demands, as it does
not require sentiment analysis or extensive data preprocessing.</p>
      </sec>
        <sec id="sec-3-15-1">
          <title>3.2. Straighforward Fine-tuning Approach</title>
        <p>
          Transformer architectures are well established as the state of the art in language modeling. In this
shared task, we used two Transformer-based architectures: RoBERTa and Longformer [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ].
• RoBERTa: RoBERTa generally offers strong versatility and performance for classification tasks.
        </p>
        <p>However, these models cannot process long input sequences; they usually accept
a maximum of 512 tokens. This limitation poses a challenge for tasks involving long contexts,
such as those addressed in this work. Therefore, we used RoBERTa as a baseline to compare
against other models that can handle longer input sequences.
• Longformer: Longformer, short for "Long-Document Transformer," was specifically designed to
efficiently process extended input sequences, making it more suitable than standard Transformer
models such as BERT or RoBERTa for tasks involving long contexts. This architecture presents
the following key features:
– Sliding-window attention mechanism: each token attends only to a fixed number of
neighboring tokens. This significantly reduces computational complexity compared to full
self-attention.
– Global attention mechanism: the model allows specific tokens to receive global attention,
enabling them to attend to all other tokens in the sequence, while the rest remain in a local
attention scope. This hybrid approach balances efficiency and contextual understanding.</p>
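        <p>A sketch of this hybrid attention scheme with the Hugging Face Longformer classes, marking only the classification token for global attention (checkpoint from Section 3.2; the two-label head matches Task 1):</p>
        <preformat>
# Longformer classification with sliding-window attention plus global attention on [CLS].
import torch
from transformers import AutoTokenizer, LongformerForSequenceClassification

name = "PlanTL-GOB-ES/longformer-base-4096-bne-es"
tokenizer = AutoTokenizer.from_pretrained(name)
model = LongformerForSequenceClassification.from_pretrained(name, num_labels=2)

long_text = "historial completo de mensajes del usuario ..."  # placeholder
enc = tokenizer(long_text, truncation=True, max_length=4096, return_tensors="pt")

global_attention_mask = torch.zeros_like(enc["input_ids"])
global_attention_mask[:, 0] = 1  # first token attends globally; the rest use the local window

logits = model(**enc, global_attention_mask=global_attention_mask).logits
        </preformat>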
          <p>
            One of the objectives of this work was to compare base models trained on general-domain data with
models that had undergone additional training specifically in the mental health domain. To this end,
we selected the following models:
1. General-domain models: We selected the models PlanTL-GOB-ES/RoBERTa-large-bne and
PlanTL-GOB-ES/Longformer-base-4096-bne-es [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ], developed by the Spanish government. The
RoBERTa model is based on the original RoBERTa architecture and was pre-trained on the largest
Spanish-language corpus, composed of texts from the National Library of Spain. The Longformer
model adapts this RoBERTa model to the Longformer architecture, enabling the processing of
longer input sequences. These models can be found in the Hugging Face repository [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ].
2. Specific-domain models: Our team developed three domain-specific models: two based on the Longformer
architecture and one built upon the RoBERTa architecture. These models were trained to generate
contextual embeddings tailored to the mental health domain. For training, we used the SuicideWatch
and Mental Health (SWMH) corpus [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], which contains texts related to a range of mental health
disorders. We did not train the models from scratch; instead, we continued their pretraining to
adapt them specifically to the mental health domain, as recommended by other research [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ].
          </p>
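          <p>A minimal sketch of this continued pretraining as masked language modeling over the SWMH texts; swmh_dataset stands in for a datasets.Dataset with a "text" column, and the hyperparameters shown are illustrative rather than the ones we used:</p>
          <preformat>
# Continued masked-language-model pretraining to adapt a base model to the domain.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

name = "PlanTL-GOB-ES/longformer-base-4096-bne-es"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=4096)

train_ds = swmh_dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="longformer-mental-health"),
    train_dataset=train_ds,
    data_collator=collator,
)
trainer.train()
          </preformat>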
          <p>We performed straightforward fine-tuning of each base model using Dataset
1, adjusting the models with the full context available for each user. This
experimentation served as a baseline for comparing a more general approach with a more task-specific
one. Table 10 shows the configuration used in the fine-tuning process.</p>
          <table-wrap id="tab10">
            <label>Table 10</label>
            <caption><p>Fine-tuning configuration.</p></caption>
            <table>
              <thead>
                <tr><th>Parameter</th><th>Value</th></tr>
              </thead>
              <tbody>
                <tr><td>optimizer</td><td>AdamW</td></tr>
                <tr><td>learning rate</td><td>3e-5</td></tr>
                <tr><td>lr scheduler type</td><td>linear</td></tr>
                <tr><td>weight decay</td><td>0.01</td></tr>
                <tr><td>number of epochs</td><td>10</td></tr>
                <tr><td>training batch size</td><td>16</td></tr>
              </tbody>
            </table>
          </table-wrap>
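          <p>Expressed with Hugging Face's TrainingArguments, the configuration of Table 10 would read as follows (the output directory name is arbitrary):</p>
          <preformat>
# Table 10 as a Hugging Face fine-tuning configuration.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="finetuned-model",
    optim="adamw_torch",            # AdamW optimizer
    learning_rate=3e-5,
    lr_scheduler_type="linear",
    weight_decay=0.01,
    num_train_epochs=10,
    per_device_train_batch_size=16,
)
          </preformat>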
      </sec>
      <sec id="sec-3-16">
        <title>Value</title>
        <p>AdamW
3e-5
linear
0.01
10
16</p>
      </sec>
      <sec id="sec-3-17">
        <title>Task 1</title>
        <p>• RoBERTa-t1-g: the PlanTL-GOB-ES/RoBERTa-large-bne model.
• RoBERTa-t1-s: RoBERTa model further pre-trained on domain-specific mental health data.
• Longformer-t1-g: the PlanTL-GOB-ES/Longformer-base-4096-bne-es model.
• Longformer-t1-s: Longformer model further pre-trained on domain-specific mental health data.</p>
        <p>Table 11 shows the results on the evaluation partition. The Longformer model pre-trained with domain-specific mental health data achieved
the best results due to its ability to handle long texts and its specialized pre-training. The results show that
pre-training on domain-relevant data improved the model’s understanding of specialized language and
enhanced its adaptation to the specific tasks.</p>
        <table-wrap id="tab11">
          <label>Table 11</label>
          <caption><p>Results of the Transformer models for Task 1 on the evaluation partition.</p></caption>
          <table>
            <thead>
              <tr><th>Model</th><th>Macro-P</th><th>Macro-R</th></tr>
            </thead>
            <tbody>
              <tr><td>RoBERTa-t1-g</td><td>0.592</td><td>0.601</td></tr>
              <tr><td>RoBERTa-t1-s</td><td>0.638</td><td>0.639</td></tr>
              <tr><td>Longformer-t1-g</td><td>0.638</td><td>0.633</td></tr>
              <tr><td>Longformer-t1-s</td><td>0.652</td><td>0.650</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-18">
        <title>Task 2</title>
        <p>We used the same models as in Task 1, but with a different dataset. In this case, all the models obtained
the highest score, i.e., correctly predicting all the samples. The results obtained by the models on the
evaluation partition are shown in Table 12.</p>
        <table-wrap id="tab12">
          <label>Table 12</label>
          <caption><p>Results of the Transformer models for Task 2 on the evaluation partition (all models predicted every sample correctly).</p></caption>
          <table>
            <thead>
              <tr><th>Model</th><th>Macro-P</th><th>Macro-R</th><th>Macro-F1</th></tr>
            </thead>
            <tbody>
              <tr><td>RoBERTa-t2-g</td><td>1.000</td><td>1.000</td><td>1.000</td></tr>
              <tr><td>RoBERTa-t2-s</td><td>1.000</td><td>1.000</td><td>1.000</td></tr>
              <tr><td>Longformer-t2-g</td><td>1.000</td><td>1.000</td><td>1.000</td></tr>
              <tr><td>Longformer-t2-s</td><td>1.000</td><td>1.000</td><td>1.000</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-24">
        <title>RoBERTa-t2-g</title>
      </sec>
      <sec id="sec-3-25">
        <title>RoBERTa-t2-s</title>
      </sec>
      <sec id="sec-3-26">
        <title>Longformer-t2-g</title>
      </sec>
      <sec id="sec-3-27">
        <title>Longformer-t2-s Macro-P</title>
        <sec id="sec-3-27-1">
          <title>3.3. Task Adaptive Fine-tuning</title>
          <p>
            The third objective of our work was to investigate the impact of task-specific training, specifically
tailored to an early detection scenario. To achieve this, we applied a data augmentation process to adapt
the original dataset to the requirements of early detection. This process followed the same methodology
employed in previous editions [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ].
          </p>
          <p>The data augmentation strategy aimed to generate additional samples for each positive user. This
process involved identifying the specific message in which a user began to exhibit signs of a mental
health disorder. Once the critical message was determined, each possible concatenation of the user’s
message history was labeled as low-risk if it occurred before the selected message, or high-risk otherwise.</p>
          <p>To detect the critical message, we conducted a series of experiments in which we trained and
evaluated several candidate models.</p>
      </sec>
      <sec id="sec-3-28">
        <title>3.3.1. Best model for data augmentation</title>
        <p>The experimentation phase aimed to identify the model with the highest performance in detecting
the critical message. To this end, we replicated the inference procedure proposed by the competition:
each model received input batches containing one message per user. The models were then required to
predict the correct labels as early as possible following this procedure. For the experimentation, we
used the complete dataset to identify the critical message for each positive user.</p>
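        <p>A sketch of this round-by-round procedure; predict_user stands for any of the trained classifiers applied to the accumulated history:</p>
        <preformat>
# Simulate the competition protocol: one new message per user per round,
# predicting from the history accumulated so far.
def simulate_rounds(user_messages, predict_user):
    # user_messages: {user_id: [message_1, message_2, ...]}, oldest first
    histories = {uid: [] for uid in user_messages}
    predictions = {}
    n_rounds = max(len(msgs) for msgs in user_messages.values())
    for r in range(n_rounds):
        for uid, msgs in user_messages.items():
            if r >= len(msgs):
                continue  # no new message for this user; keep the last prediction
            histories[uid].append(msgs[r])
            predictions[uid] = predict_user(" ".join(histories[uid]))
    return predictions
        </preformat>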
        <p>The models proposed for this analysis were Longformer-t1-s, RoBERTa-t1-s, and SVM-t1-4, as
they achieved the best performance during the training phase. Table 13 presents the experimentation
results, where we can observe that Longformer-t1-s achieved the best performance in classifying
users under an early detection strategy, and it was therefore the selected model.</p>
        <table-wrap id="tab13">
          <label>Table 13</label>
          <caption><p>Results of the candidate models for critical-message detection under the early detection procedure.</p></caption>
          <table>
            <thead>
              <tr><th>Model</th><th>Macro-P</th><th>Macro-R</th><th>Macro-F1</th></tr>
            </thead>
            <tbody>
              <tr><td>SVM-t1-4</td><td>0.77</td><td>0.69</td><td>0.70</td></tr>
              <tr><td>RoBERTa-t1-s</td><td>0.85</td><td>0.77</td><td>0.78</td></tr>
              <tr><td>Longformer-t1-s</td><td>0.88</td><td>0.79</td><td>0.81</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>This technique results in a new dataset with a higher number of positive samples for training. Table
14 shows the two datasets created using the data augmentation process described above, one for each
input length: 512 tokens for RoBERTa and 4096 tokens for Longformer. Dataset 2 refers to the one
created with a maximum message length of 512, while Dataset 3 refers to the one with a maximum
length of 4096.</p>
      </sec>
      <sec id="sec-3-34">
        <title>Original</title>
      </sec>
      <sec id="sec-3-35">
        <title>Dataset 2</title>
      </sec>
      <sec id="sec-3-36">
        <title>Dataset 3</title>
        <p>For this task, we did not submit any system using this approach, for several reasons:
1. First, this task does not involve early detection; therefore, adapting our models
to label the users as early as possible is unnecessary. Instead, we decided that the optimal strategy
was to wait until the maximum amount of context was available before making an inference,
which corresponds to fine-tuning using all the samples per user.
2. Second, the nature of the task itself. Since all users exhibit some form
of addiction, there is no critical message marking a transition from a non-addicted to an
addicted state.
3. Third, the models previously tested had already demonstrated strong
performance on this task. Consequently, and considering time constraints, we opted to limit further
experimentation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Runs</title>
      <sec id="sec-4-1">
        <title>4.1. Run Configuration</title>
        <p>In addition to selecting the model for each run, the classification systems required configuring additional
parameters.</p>
        <p>The second task only considers the most recent message sent by each user. However, since we
could not predict when we would stop receiving new information for a user, we chose to send new predictions
at each round.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <sec id="sec-5-0">
        <title>5.1. Task 1</title>
        <p>The results obtained in Task 1 support our initial
hypotheses: task-specific pretraining and tailored fine-tuning contribute to improved model
performance.</p>
        <p>Although the top-performing runs employed Transformer-based models, the SVM run achieved a
comparable result, only 0.1% lower than Run2 and 1% lower than Run1. This suggests that classical
approaches such as SVMs remain effective for detecting mental health conditions, owing to their
capacity to manage extensive contextual information. Consequently, SVMs are a suitable option
in scenarios with limited computational resources.</p>
      </sec>
      <sec id="sec-5-0b">
        <title>5.2. Task 2</title>
        <p>As shown in Table 18, the results obtained by all runs are highly competitive, although not as strong
as with the development split. These results further reinforce the previously discussed conclusions,
supporting our initial hypotheses.</p>
      </sec>
      <sec id="sec-5-1">
        <title>5.3. Carbon emissions</title>
        <p>One of the primary objectives of the competition is to identify systems capable of completing the tasks
with minimal resource consumption. This will assist in identifying technologies that can operate on
mobile devices or personal computers, as well as those with the lowest carbon emissions. Consequently,
we provide the following information:
• Total processing time (in milliseconds)
• CO2 emissions (in kg)</p>
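        <p>Measurements of this kind can be reproduced with the CodeCarbon tracker around each round (a sketch; run_inference_round is a placeholder):</p>
        <preformat>
# Track the CO2 emissions of one inference round with CodeCarbon.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
run_inference_round()            # placeholder: predictions for one packet of messages
emissions_kg = tracker.stop()    # estimated CO2-equivalent emissions in kg
print(f"{emissions_kg:.6f} kg CO2eq")
        </preformat>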
        <p>
          Using the provided script, which leverages the CodeCarbon API [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] to calculate emissions, we
present our team’s computer configuration in Table 19. This table outlines the types and quantities of
CPUs and GPUs utilized and the total amount of RAM employed. The results for the Longformer-t1-t
Run 1 are also presented.
        </p>
        <sec id="sec-5-1-1">
          <title>Measurements</title>
          <p>CPU_Count
GPU_Count
CPU_Model
GPU_Model
RAM_Total_Size
Country_ISO_Code</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>Values</title>
          <p>24
1
12th Gen Intel(R) Core(TM) i9-12900K</p>
          <p>NVIDIA GeForce RTX 4090
128 GB</p>
          <p>ESP</p>
        <p>Figure 1 presents the variation in emissions (panel a: CO2 emissions in kg per round) and duration
(panel b: duration in milliseconds per round) observed throughout the experimentation
process. A clear correlation between these metrics indicates that rounds of longer duration resulted
in higher CO2 emissions. Given that all rounds employed identical models and configurations, the
primary factors influencing emissions were the duration of each round and the cumulative context
associated with the user.</p>
          <p>Figure 2 illustrates the cumulative energy consumption of each component. The GPU emerges as the
predominant energy consumer, representing approximately 96% of total energy usage. The RAM follows
with a consumption of 2.5%, while the CPU contributes only 0.2% to the overall energy consumption.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Task 2 Analysis</title>
      <p>After analyzing the results obtained in Task 2, we identified a significant discrepancy between the
evaluation and test results. Moreover, all models achieved perfect scores during the evaluation phase,
which raised further concerns. These observations prompted us to examine the data provided for this
task more closely and explore potential solutions.</p>
      <p>Our first step was to verify the Train and Evaluation splits for potential errors. However, we
consistently obtained the same results after checking the class distributions in each partition and
training with different splits. Having ruled out any issues with the data partitioning, we
proceeded to analyze the provided dataset in greater detail. Specifically, we conducted a class-by-class
analysis to identify any possible anomalies or patterns. The study focused on examining the message
length, in terms of tokens, for each class.</p>
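      <p>The length analysis itself is straightforward; a sketch with pandas, using whitespace tokenization as a stand-in for the tokenizer actually used:</p>
      <preformat>
# Per-class distribution of message lengths.
import pandas as pd

texts = ["mensajes concatenados de un usuario", "..."]  # placeholder documents
labels = ["betting", "lootboxes"]                       # placeholder gold classes

df = pd.DataFrame({"label": labels,
                   "n_tokens": [len(t.split()) for t in texts]})
print(df.groupby("label")["n_tokens"].describe())  # count, mean, std, quartiles per class
      </preformat>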
      <p>Figure 3 shows the density curves for each class in each data partition. We observe a notable disparity
between classes: two contain significantly shorter texts than the others. If we sort the classes by text
length, the resulting ranking from shortest to longest is as follows: (1) Lootboxes, (2) Online Gaming, (3)
Trading and Crypto, (4) Betting. When examining each class individually, we observe that the Lootboxes
class exhibits a significantly higher density than the others, with most of its texts concentrated around
30 tokens. Subsequently, the Online Gaming class shows a lower density; however, most of its texts still
fall within the range of 100 to 150 tokens. Finally, the remaining two classes display similar density curves,
with no predominant text length. Instead, the texts are distributed over a broad range, approximately
from 300 to 1500 tokens.</p>
      <p>When focusing on partition-based analysis, we can observe that previously described patterns are
consistent across partitions. Although there is a slight variation between the Trading and Crypto and
Betting classes, where the density curves are not as similar as before, the overall trend remains: the
curves in both partitions are broadly comparable.</p>
      <p>For additional statistics related to the class analysis, please refer to Appendix A, specifically Table 20.</p>
      <p>[Figure 3: Density curves of text length (in tokens) per class (betting, lootboxes, onlinegaming, trading) in each data partition.]</p>
      <p>All these observations have led us to hypothesize that the different models may be leveraging message
length as a feature when classifying samples during inference. We are not suggesting that it is the
primary factor driving classification decisions, but rather that it is a highly influential one. This is
especially relevant given that there is a clear distinction in message lengths across classes, and this
pattern is also preserved in both the training and evaluation partitions. As a result, models may
inadvertently rely on this feature during training, as it is not penalized; on the contrary, it may even be
reinforced.</p>
      <p>To test this hypothesis, we conducted the following experiment: we simulated the competition setup
where, in each round, one message per user was provided. However, we supplied only 10 words per
user instead of the whole message. This allowed us to evaluate the models’ predictions under conditions
of reduced text length and assess the extent to which their performance depends on message length.</p>
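      <p>The truncation used in this probe is simply the following (a sketch):</p>
      <preformat>
def truncate_words(message: str, n: int = 10) -> str:
    # Keep only the first n whitespace-separated words of an incoming message.
    return " ".join(message.split()[:n])
      </preformat>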
      <p>Figure 4 shows the distribution of predictions by class and by each model submitted to the task. We
observe that models based on Transformer architectures frequently predict the lootboxes class when
provided with messages containing very little contextual information, even though this class presents
the smallest number of samples. Coincidentally, lootboxes is also the class with the shortest messages
overall. Furthermore, as the size of the available context increases (number of words already received),
the prediction frequencies become more balanced, leading to improved performance.</p>
      <p>In contrast, the SVM model is less sensitive to message length and relies more heavily on the
vocabulary. This is evident because it does not initially overpredict lootboxes, but instead shows a more
diverse distribution of predictions from the start. Nevertheless, we still observe that increasing the
number of tokens leads to a more balanced prediction distribution and, consequently, better performance.
This could be related to the TF-IDF feature extraction, which was fitted with all the user messages
treated as a single document (the messages were concatenated).</p>
      <p>The experiment supports our hypothesis that the models have leveraged this unintended feature for
classification. This reliance can lead the models to make errors during testing if the test partition does
not follow the same distribution as the training and evaluation partitions.</p>
      <p>[Figure 4: Distribution of predictions by class (betting, onlinegaming, trading, lootboxes) as a function of the
number of words received (10 to 200), for (a) the Longformer model, (b) the RoBERTa model, and (c) the SVM model.]</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this paper, we have presented the participation of the ELiRF-VRAIN team in the shared tasks of
MentalRiskES at IberLEF 2025. Besides evaluating traditional classification models and cutting-edge
Transformer models, our team’s most innovative contribution was the use of Longformer models to
broaden the context for decision-making, leveraging pre-trained models specifically designed for the
mental health domain, and introducing a new data augmentation technique that customizes model
training to the specific task at hand.</p>
      <p>The highly competitive results support our proposal’s validity, demonstrating the performance of
models tailored explicitly for the task at hand.</p>
      <p>For future work, three areas of improvement are identified. Firstly, we aim to enhance early detection so
that the system requires less initial context to make accurate decisions. Secondly, we plan to incorporate
Explainable Artificial Intelligence (XAI) techniques to understand the system’s behavior better. Lastly,
we aim to introduce new analyses and explore alternative training techniques to prevent systems
from relying on the length of the input as a classification feature.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work is partially supported by MCIN/AEI/10.13039/501100011033 and "ERDF A way of making
Europe" under grant PID2021-126061OB-C41. Partially supported by the Vicerrectorado de Investigación
de la Universitat Politècnica de València PAID-01-23. It is also partially supported by the Spanish
Ministerio de Universidades under the grant FPU21/05288 for university teacher training and by the
Generalitat Valenciana under CIPROM/2021/023 project.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT and Grammarly for grammar
and spelling checking, paraphrasing, translation, and rewording. After using these tools/services, the author(s)
reviewed and edited the content as needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-10">
      <title>A. Statistics of the Total Number of Words</title>
      <p>Table 20 reports per-class statistics of the total number of words; the reported values are 13.74, 55.99,
288.96, and 292.36, consistent with the ranking of classes by text length discussed in Section 6
(Lootboxes, Online Gaming, Trading and Crypto, and Betting, from shortest to longest).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] World Health Organization, Mental disorders, 2022. URL: https://www.who.int/news-room/fact-sheets/detail/mental-disorders, accessed: 2024-05-15.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] A. M. Mármol-Romero, P. Álvarez Ojeda, A. Moreno-Muñoz, F. M. P. del Arco, M. D. Molina-González, M.-T. Martín-Valdivia, L. A. Ureña-López, A. Montejo-Ráez, Overview of MentalRiskES at IberLEF 2025: Early Detection of Mental Disorders Risk in Spanish, Procesamiento del Lenguaje Natural 75 (2025).</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] J. Á. González-Barba, L. Chiruzzo, S. M. Jiménez-Zafra, Overview of IberLEF 2025: Natural Language Processing Challenges for Spanish and other Iberian Languages, in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2025), co-located with the 41st Conference of the Spanish Society for Natural Language Processing (SEPLN 2025), CEUR-WS.org, 2025.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is All You Need, Advances in Neural Information Processing Systems 30 (2017). URL: https://arxiv.org/abs/1706.03762, accessed: 2024-05-15.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A Robustly Optimized BERT Pretraining Approach, arXiv preprint arXiv:1907.11692 (2019). URL: https://arxiv.org/abs/1907.11692.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] I. Beltagy, M. E. Peters, A. Cohan, Longformer: The Long-Document Transformer, arXiv preprint arXiv:2004.05150 (2020). URL: https://arxiv.org/abs/2004.05150.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] P. Álvarez-Ojeda, M. V. Cantero-Romero, A. Semikozova, A. Montejo-Ráez, The PRECOM-SM Corpus: Gambling in Spanish Social Media, in: Proceedings of the 31st International Conference on Computational Linguistics, 2025, pp. 17-28.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, É. Duchesnay, Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12 (2011) 2825-2830. URL: https://jmlr.org/papers/v12/pedregosa11a.html.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] A. Casamayor, V. Ahuir, A. Molina, L.-F. Hurtado, ELiRF-VRAIN at MentalRiskES 2024: Using LongFormer for Early Detection of Mental Disorders Risk, in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024), co-located with the 40th International Conference of the Spanish Society for Natural Language Processing (SEPLN 2024), volume 3756 of CEUR Workshop Proceedings, CEUR-WS.org, 2024, pp. 24-33. URL: https://ceur-ws.org/Vol-3756/MentalRiskES2024_paper3.pdf.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] L. X. Yuan, distilbert-base-multilingual-cased-sentiments-student (revision 2e33845), 2023. URL: https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student. doi:10.57967/hf/1422.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] A. G. Fandiño, J. A. Estapé, M. Pàmies, J. L. Palao, J. S. Ocampo, C. P. Carrino, C. A. Oller, C. R. Penagos, A. G. Agirre, M. Villegas, MarIA: Spanish Language Models, Procesamiento del Lenguaje Natural 68 (2022). URL: https://upcommons.upc.edu/handle/2117/367156. doi:10.26342/2022-68-3.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. L. Scao, S. Gugger, M. Drame, Q. Lhoest, A. M. Rush, Transformers: State-of-the-Art Natural Language Processing, 2020. URL: https://arxiv.org/abs/1910.03771. arXiv:1910.03771.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] S. Ji, X. Li, Z. Huang, E. Cambria, Suicidal ideation and mental disorder detection with attentive relation networks, Neural Computing and Applications (2021).</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] S. Gururangan, A. Marasovic, S. Swayamdipta, K. Lo, I. Beltagy, D. Downey, N. A. Smith, Don't Stop Pretraining: Adapt Language Models to Domains and Tasks, CoRR abs/2004.10964 (2020). URL: https://arxiv.org/abs/2004.10964. arXiv:2004.10964.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] J. Cañete, G. Chaperon, C. Fuentes, J. Pérez, B. Poblete, RoBERTuito: A pre-trained language model for social media text in Spanish, in: Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT), Association for Computational Linguistics, 2022, pp. 132-140. URL: https://aclanthology.org/2022.wnut-1.14.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] CodeCarbon, CodeCarbon: Track and Reduce Your Carbon Emissions from Machine Learning Workloads, https://mlco2.github.io/codecarbon/index.html, 2024. Accessed: 2024-05-15.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>