<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Tackling a Challenging Corpus for Early Detection of Gambling Disorder: UNSL at MentalRiskES 2025</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Horacio Thompson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcelo Errecalde</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)</institution>
          ,
          <addr-line>San Luis</addr-line>
          ,
          <country country="AR">Argentina</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidad Nacional de San Luis (UNSL)</institution>
          ,
          <addr-line>Ejército de Los Andes 950, San Luis, C.P. 5700</addr-line>
          ,
          <country country="AR">Argentina</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Gambling disorder is a complex behavioral addiction that is challenging to understand and address, with severe physical, psychological, and social consequences. Early Risk Detection (ERD) on the Web has become a key task in the scientific community for identifying early signs of mental health behaviors based on social media activity. This work presents our participation in the MentalRiskES 2025 challenge, specifically in Task 1, aimed at classifying users at high or low risk of developing a gambling-related disorder. We proposed three methods based on a CPI+DMC approach, addressing predictive effectiveness and decision-making speed as independent objectives. The components were implemented using the SS3, BERT with extended vocabulary, and SBERT models, followed by decision policies based on historical user analysis. Although it was a challenging corpus, two of our proposals achieved the top two positions in the official results, performing notably in decision metrics. Further analysis revealed some difficulty in distinguishing between users at high and low risk, reinforcing the need to explore strategies to improve data interpretation and quality, and to promote more transparent and reliable ERD systems for mental disorders.</p>
      </abstract>
      <kwd-group>
        <kwd>Early Risk Detection</kwd>
        <kwd>SS3</kwd>
        <kwd>BERT</kwd>
        <kwd>Sentence-BERT</kwd>
        <kwd>Decision Policy</kwd>
        <kwd>Mental Health</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        According to the World Health Organization, an estimated 1.2% of the adult population suffers from
a gambling disorder, with the risk growing even for young people and children. In recent years, this
disorder has been recognized as a behavioral addiction, prompting the Diagnostic and Statistical Manual
of Mental Disorders (DSM-5) and the International Classification of Diseases (ICD-11) to reclassify
it alongside substance-related disorders [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], replacing the term pathological gambling with gambling
disorder. This condition encompasses a wide spectrum of physical, psychological, and social
consequences [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], including high substance use [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], symptoms of anxiety, depression, stress, and impulsivity
[
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ], as well as work-related and financial conflicts, relationship deterioration, and criminal behavior
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Technological advancements and widespread access to digital platforms have contributed to the
increasing prevalence of this disorder [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], alongside other behaviors such as compulsive shopping and
problematic social media use [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Furthermore, numerous studies have highlighted the challenges in
establishing precise criteria and consistent methods for estimating the prevalence of gambling disorder
due to the diversity of assessment tools, risk factors, and controversies surrounding the validity of
diagnostic criteria [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ].
      </p>
      <p>
        In this context, Early Risk Detection (ERD) on the Web has become a significant research area in
recent years, aiming to identify users who exhibit signs of developing a mental health condition as early
as possible. Initiatives such as MentalRiskES have fostered research on ERD in Spanish [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ], while
CLEF eRisk has promoted similar efforts primarily in English [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Our research group has actively
participated in these challenges, addressing the detection of depression and eating disorders in the
Spanish language [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], as well as the detection of depression [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], pathological gambling [
        <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
        ], and
anorexia [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] in English. In the MentalRiskES 2025 edition [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ], a challenge focused on early detection
of gambling disorder was proposed, consisting of two tasks: binary classification (Task 1), aimed at
identifying users at high (positive) or low (negative) risk; and multiclass classification (Task 2), designed
to distinguish the specific type of addiction associated with the disorder, such as Betting, Online Gaming,
Trading, and Lootboxes. The challenge was conducted in two phases: a training stage, using a labeled
corpus provided by the Organizers, and an online evaluation stage, where teams analyzed user posts
progressively while interacting with a server in an early environment. Our team participated in Task 1,
presenting three proposals based on a CPI+DMC approach [23]. This approach conceptualizes ERD as a
multi-objective problem, where the goal is to optimize classification effectiveness and decision-making
speed independently. It consists of two components: a Classification with Partial Information (CPI)
model that processes user content progressively and a policy for Deciding the Moment of Classification
(DMC) that determines when to make a final decision based on the accumulated evidence. While
alternatives exist that simultaneously address both objectives [24], we opted for a modular design due
to the complexity of the problem. For the CPI component, we implemented three different models: SS3,
BERT with extended vocabulary, and SBERT. For the DMC component, we designed decision policies
that evaluate users based on historical analysis.
      </p>
      <p>According to the official ranking released by the Organizers, two of our proposals achieved first
and second place, with remarkable results in the Macro F1 score and other decision-making metrics.
A detailed analysis of the results highlighted the inherent complexity of the task: the differentiation
between users at high and low risk is subtle and difficult to define, posing a challenge for both the
proposed models and potential human evaluators. The structure of the paper is as follows: Section 2
presents details of the corpus and a preliminary analysis of the data; Section 3 describes the methodology
adopted and the models used; Section 4 discusses the results obtained; and Section 5 offers conclusions
and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Corpus</title>
      <p>To address Task 1, the Organizers developed a corpus [25] divided into three parts, deployed in the
different phases of the challenge (Table 1). The Train and Trial sets were provided for model training
and server connection testing, while the Test set was reserved for the final evaluation of participating
teams. Each set presents a balanced class distribution, with a similar mean number of posts per user
(around 60) and a relatively short post length, averaging fewer than 10 words per post, although some
posts are considerably longer. Additionally, users come from the Telegram and Twitch platforms, with
a balanced distribution across classes.</p>
      <p>User information is organized at the post level and includes structured metadata such as message ID,
round number, user pseudonym, text content, date, and origin platform. For example, a post from user
subject1 is represented as: {id_message: 123, round: 1, nick: "subject1", message: "...", date: "2021-01-06
04:02:48+01:00", platform: "Telegram"}. Each user was labeled by the Organizers with a binary class
(high or low risk) based on their complete posting history, considering that all users show some level of
gambling behavior, though their risk levels vary.</p>
      <p>We conducted a preliminary exploration of the available training sets (357 users across the Train
and Trial sets), aiming to analyze the user textual content. First, we calculated the cosine similarity
on the TF-IDF representations generated from the complete vocabulary of each class, obtaining a
value of 0.854, indicating a high lexical similarity between positive and negative users. Next, we
used the Jaccard index to assess the lexical overlap between the classes, considering the 1,000 most
frequent words in each. The resulting value was 0.581, corresponding to 735 shared words out of 1,265
unique words, further reinforcing the significant lexical overlap between the two classes. Inspection of
these shared words revealed topics such as cryptocurrencies, financial markets, games, betting, digital
platforms, and various emotional states. The remaining words unique to each class were strongly linked
to these topics, reflecting subtle differences. For instance, positive users tended to use more technical
and advanced language, referencing specific platforms (e.g., BingX, Winamax, Ledger, Discord, OKEx)
and financial market concepts (e.g., Elliot, scalping, velas (candles), store, liquidez (liquidity)), and
participated more actively in forums and communities. In contrast, negative users displayed less technical
language, with expressions suggesting a more cautious attitude (e.g., aprender (learn), consejo (advice),
ojalá (hopefully), imposible (impossible), paciencia (patience)), possibly reflecting less experience with
these topics. As part of the study, we also found that 80% of users made most of their posts during
nighttime hours (between 6 p.m. and 6 a.m.), with this tendency being slightly stronger among positive
users (82%) compared to negative users (78%). It should be noted that the timestamps are in UTC+1
(+01:00), although it is unclear whether this timezone corresponds to the actual location of each user.
After manually inspecting posts, personal contexts, and user dynamics, no clear pattern emerged that
could differentiate high-risk from low-risk users. Therefore, this exploration suggests that the task is
indeed challenging, which motivated the models proposed by our team.</p>
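      <p>The lexical-overlap measure described above can be illustrated with a minimal Python sketch. It is not the code used for the reported analysis: the whitespace tokenization, the helper names top_k_words and jaccard, and the toy posts are assumptions made for illustration. The last lines check only the arithmetic of the reported figure, since two lists of 1,000 words that share 735 entries contain 2,000 − 735 = 1,265 unique words.</p>
      <preformat>
```python
from collections import Counter


def top_k_words(texts, k):
    """Return the set of the k most frequent whitespace-separated tokens."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word for word, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


# Toy posts (hypothetical, not taken from the corpus).
positive_posts = ["scalping en bingx hoy", "velas y liquidez en bingx"]
negative_posts = ["quiero aprender scalping", "paciencia y consejo hoy"]

pos_top = top_k_words(positive_posts, k=1000)
neg_top = top_k_words(negative_posts, k=1000)
toy_jaccard = jaccard(pos_top, neg_top)

# Arithmetic check of the reported value: 735 shared words out of 1,265 unique words.
unique_words = 1000 + 1000 - 735
reported = 735 / unique_words
print(unique_words, round(reported, 3))
```
      </preformat>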
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Given the domain complexity and the high lexical similarity between classes, we explored three distinct
methods, each aimed at capturing different levels of representation and implementing alternative early
classification strategies. We designed three proposals that combine different classifiers and
decision-making strategies in line with the CPI+DMC approach. Below, we describe the models employed and
the experiments conducted for their training.</p>
      <sec id="sec-3-1">
        <title>3.1. Models</title>
        <sec id="sec-3-1-1">
          <title>3.1.1. SS3 Using a Global Decision Policy</title>
          <p>The SS3 model [26] is a supervised classifier created for ERD problems. It is a robust method that enables
incremental user analysis and facilitates the interpretability of the decisions. During training, SS3
builds a vocabulary with term frequencies for each class. It employs a global value (GV) function that
assigns a score to each term relative to the target classes, considering three parameters: σ (smoothness),
ρ (sanction), and λ (significance). The model aims to emulate human behavior by focusing on key terms
when classifying a text, thereby contributing to its interpretability. Internally, it performs a hierarchical
analysis at multiple levels (words, sentences, and paragraphs) and applies summary operators to obtain
a global value for each. Then, the final classification depends on the sum of the GV scores of all the
terms in a user’s text. This model constitutes the CPI component of our first proposal, where the
representation of samples is created from a frequentist and interpretable model based on term relevance
without relying on deep learning.</p>
          <p>
            To implement the DMC component, we adopted a global decision policy previously proposed by our
laboratory [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ]. We defined a score that estimates a user’s overall risk level based on their post
history and the target classes. During user evaluation, we maintain two confidence values, positive and
negative, which accumulate the GV of each observed term across the user’s writing history. At each post
round (delay), the user’s current risk level is estimated by normalizing these cumulative values:
          </p>
          <disp-formula id="eq-1">
            <label>(1)</label>
            <tex-math><![CDATA[\mathrm{score} = \mathrm{softmax}\left(\left[\frac{\mathrm{positive}}{\mathrm{delay}},\ \frac{\mathrm{negative}}{\mathrm{delay}}\right]\right)_{\mathrm{positive}}]]></tex-math>
          </disp-formula>
          <p>This normalization ensures that the resulting scores are within the range [0, 1] and allows a fairer
comparison between users by mitigating the impact of very short or very long posting histories. In
this way, score represents the relative likelihood that the user belongs to the positive class and serves
as the basis for the decision-making process that determines whether the user should be classified as
positive or negative:</p>
          <disp-formula id="eq-2">
            <label>(2)</label>
            <tex-math><![CDATA[\mathrm{decision} = \begin{cases} 1, & \text{if } \mathrm{score} > \mathrm{median}(\mathrm{scores}) + \gamma \cdot \mathrm{MAD}(\mathrm{scores}) \\ 0, & \text{otherwise.} \end{cases}]]></tex-math>
          </disp-formula>
          <p>This policy uses a dynamic threshold defined by the median of all users’ scores (scores = {score<sub>u</sub> | u ∈
Users}, where score<sub>u</sub> is the score of user u) and the median absolute deviation (MAD), which together define an uncertainty interval:
median(scores) ± γ · MAD(scores). The hyperparameter γ controls how much a user’s score must
deviate from the median to be classified as positive. In other words, a user is considered at risk if their
score is significantly higher than most users’ scores.</p>
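          <p>The global decision policy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used in our experiments: the function names, the name gamma for the deviation hyperparameter, and the accumulated confidence values are hypothetical, and the accumulation of GV values by SS3 is replaced by fixed toy numbers.</p>
          <preformat>
```python
import math
from statistics import median


def positive_score(positive, negative, delay):
    """Softmax over the delay-normalized confidence values; returns the positive component."""
    zp, zn = positive / delay, negative / delay
    m = max(zp, zn)  # subtract the maximum for numerical stability
    ep, en = math.exp(zp - m), math.exp(zn - m)
    return ep / (ep + en)


def mad(values):
    """Median absolute deviation around the median."""
    med = median(values)
    return median([abs(v - med) for v in values])


def global_decisions(scores, gamma):
    """Flag users whose score exceeds median(scores) + gamma * MAD(scores)."""
    values = list(scores.values())
    threshold = median(values) + gamma * mad(values)
    return {user: int(score > threshold) for user, score in scores.items()}


# Hypothetical accumulated confidence values (positive, negative) after 10 rounds.
accumulated = {"u1": (4.0, 1.0), "u2": (2.0, 2.0), "u3": (1.0, 3.0)}
scores = {u: positive_score(p, n, delay=10) for u, (p, n) in accumulated.items()}
decisions = global_decisions(scores, gamma=0.5)
print(decisions)
```
          </preformat>
          <p>Because the threshold is computed from the scores of all users observed so far, the same score can lead to different decisions as the population of scores changes; in this toy case only u1 lies above the uncertainty interval.</p>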
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Extended BERT Using a History-Based Decision Policy</title>
          <p>We used this transformer-based version as a baseline. It involves fine-tuning a pre-trained BERT
model by expanding its original vocabulary, enabling it to represent domain-specific terms previously
unknown to the model and whose semantics can contribute to the classification task. Specifically, we
used BETO [27], a BERT model pre-trained on Spanish corpora. To identify the new vocabulary terms,
we relied on the SS3 model (from our previous proposal) to rank the words according to their relevance
to the positive class, from which we selected those to incorporate into the model. In this way, the CPI
component of our second proposal aims to obtain distributed and contextualized text representations
enriched with domain-relevant terms.</p>
          <p>
            For the DMC component, we employed a decision policy based on the model prediction history
during early user detection, referred to as the history-based rule [
            <xref ref-type="bibr" rid="ref16 ref18 ref19 ref20">18, 19, 16, 20, 24</xref>
            ]. Since transformer
models have limitations on the number of tokens they can process, we used a sliding window that
concatenates the current post with the previous N ones. At each round t, the model predicts the current
window and applies the history-based rule:
          </p>
          <disp-formula id="eq-3">
            <label>(3)</label>
            <tex-math><![CDATA[\mathrm{decision} = \begin{cases} 1, & \text{if } \sum_{i=1}^{t} \mathbb{I}(p_i \geq \theta) \geq M \\ 0, & \text{otherwise.} \end{cases}]]></tex-math>
          </disp-formula>
          <p>Here, p<sub>i</sub> is the predicted probability at round i, θ is the decision threshold, M is the number of
required positive predictions, and I(·) is the indicator function. The hyperparameters θ and M
determine the sensitivity and tolerance of the policy and were tuned based on the
model behavior during the early evaluation of the users. To achieve this, we used the mock-server tool<sup>1</sup>,
which simulates an early detection environment through rounds of posts and response submissions,
enabling the evaluation of the model’s performance through various metrics.</p>
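          <p>As an illustration, the sliding window and the history-based rule can be sketched as follows. This is a simplified sketch, not our production code: the function names, the parameter names threshold and min_positive, and the per-round probabilities are hypothetical, and the classifier that would produce those probabilities is omitted.</p>
          <preformat>
```python
def build_window(posts, n_previous):
    """Concatenate the current (last) post with up to n_previous preceding posts."""
    return " ".join(posts[-(n_previous + 1):])


def history_based_decision(probabilities, threshold, min_positive):
    """Return 1 once at least min_positive rounds have a probability >= threshold."""
    hits = sum(1 for p in probabilities if p >= threshold)
    return int(hits >= min_positive)


def first_alert_round(probabilities, threshold, min_positive):
    """Return the 1-based round at which the alert fires, or None if it never does."""
    for t in range(1, len(probabilities) + 1):
        if history_based_decision(probabilities[:t], threshold, min_positive):
            return t
    return None


# Hypothetical per-round probabilities predicted for one user.
probs = [0.2, 0.7, 0.4, 0.8, 0.9, 0.65, 0.3]
alert = first_alert_round(probs, threshold=0.6, min_positive=3)
window = build_window(["a", "b", "c", "d"], n_previous=2)
print(alert, window)
```
          </preformat>
          <p>Raising min_positive makes the policy wait for more consistent evidence before issuing an alert, at the cost of a later decision.</p>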
        </sec>
        <sec id="sec-3-1-3">
          <title>3.1.3. SBERT Using a History-Based Decision Policy</title>
          <p>The third variant relied on Sentence-BERT (SBERT) [28], a BERT-based model adapted with a Siamese
architecture to generate dense, sentence-level representations, capturing semantic relationships to
solve tasks such as classification and semantic search. For the CPI component, we used SetFit
(Sentence Transformer Fine-tuning) [29], an efficient framework designed for few-shot scenarios based on
contrastive learning. The SBERT encoder is fine-tuned by automatically generating pairs of examples
(positive and negative) from the original dataset and training the model to produce embeddings that
are closer for examples from the same class and distant from those of different classes. This process
results in a discriminative semantic space that supports class separation according to sentence-level
representations obtained by the fine-tuned encoder. Then, an independent classifier is trained on
the resulting embeddings without further modifying the encoder, leading to an efficient and effective
method for classification tasks in complex domains. For DMC, we applied the same history-based rule
(Equation 3), considering the model prediction history to decide when to trigger a risk alert.</p>
          <p><sup>1</sup> Available at: https://github.com/jmloyola/erisk_mock_server</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Experiments</title>
        <p>Following the CPI+DMC approach, the experimentation was organized in two stages. In the first stage,
we explored different configurations and hyperparameters to find optimal models for user classification.
In the second stage, we evaluated different decision-making policies by testing the previously selected
models in an early detection environment using the mock-server tool. The data usage and specific model
configurations adopted by our team are detailed below.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Data and preprocessing</title>
          <p>We used the textual content of user posts and discarded metadata such as date and platform. Although
we considered this information, preliminary results showed no performance improvements, and no
substantial patterns were found to justify its use. We merged the Train and Trial sets for the experiments,
resulting in 357 samples: 257 for model training and validation and 100 for evaluation in an early
detection environment while maintaining a balanced distribution between classes. Preprocessing
included converting texts to lowercase, transforming Unicode and HTML sequences into corresponding
symbols, normalizing URLs using the ’weblink’ token, removing repeated words, and applying other
basic text-cleaning operations.</p>
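          <p>The preprocessing steps listed above can be approximated with the following Python sketch. It is an illustration only: the regular expressions, the use of NFKC normalization as a stand-in for the Unicode handling, and the function name preprocess are assumptions, and the unspecified basic text-cleaning operations are reduced here to whitespace collapsing.</p>
          <preformat>
```python
import html
import re
import unicodedata

# Assumed patterns; the exact rules used in the experiments are not specified.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
REPEATED_WORD = re.compile(r"\b(\w+)(\s+\1\b)+")


def preprocess(text):
    """Lowercase, decode HTML sequences, normalize URLs, and drop repeated words."""
    text = html.unescape(text)                  # HTML sequences to symbols
    text = unicodedata.normalize("NFKC", text)  # assumed Unicode normalization
    text = text.lower()
    text = URL_PATTERN.sub("weblink", text)     # every URL becomes one token
    text = REPEATED_WORD.sub(r"\1", text)       # "mira mira" becomes "mira"
    return " ".join(text.split())               # collapse extra whitespace


cleaned = preprocess("Mira mira esto: BTC &amp; ETH en https://example.com/x  ya")
print(cleaned)
```
          </preformat>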
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Model setup</title>
          <p>UNSL#0. SS3 model trained on character trigrams, with its three hyperparameters set to 0.44, 0.5, and
0.86, selected via grid search optimized for the F1 metric. A global decision policy was applied,
configured with γ = 0.5.</p>
          <p>UNSL#1. We used the BETO model (checkpoint: dccuchile/bert-base-spanish-wwm-uncased)
and included 25 domain-relevant words extracted from the SS3 model, considering confidence
values assigned to the positive class. This extension allowed us to include terms originally
outside the BETO vocabulary, such as rebote (rebound), combi (combo bets or parlays), divergencia
(divergence), BingX, scalping, and velita (candlestick), among others. The remaining
hyperparameters were: optimizer = AdamW, learning_rate = 5E-5, scheduler = LinearSchedulerWarmup,
batch_size = 32, and n_epochs = 10. The checkpoint with the highest F1 score on the validation
set was selected. We used a history-based rule configured with M = 10 required positive predictions and a decision threshold θ = 0.6.</p>
          <p>UNSL#2. SBERT model, based on BETO and pre-trained on semantic similarity tasks in Spanish
(checkpoint: hiiamsid/sentence_similarity_spanish_es). We fine-tuned the encoder using the
CosineSimilarityLoss function, followed by a logistic regression classifier trained on the resulting
embeddings. The configuration included batch_size = 16, num_epochs = 1, num_iteration = 20,
and learning_rate = 2E-5. Finally, we defined a history-based rule configured with M = 10 required
positive predictions and a decision threshold θ = 0.7.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>
        A total of 38 proposals were submitted by thirteen teams for Task 1. The Organizers evaluated the
models using classification and latency metrics, and published an official ranking based on the Macro
F1. Table 2 summarizes the results obtained by our models and compares them with some of the most
relevant proposals (complete official results are reported in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]). We highlight the following observations:
      </p>
      <list list-type="bullet">
        <list-item><p>UNSL#2 reached first place in the overall ranking, achieving a Macro F1 of 0.567. It also obtained
the second-best results in Accuracy, Macro Recall, Micro Precision, Micro Recall, and Micro F1.
Additionally, it showed an acceptable F<sub>latency</sub> relative to the top three models.</p></list-item>
        <list-item><p>UNSL#0 was ranked second, with a Macro F1 of 0.563, and delivered comparable performance to
UNSL#2. It excelled in several metrics, obtaining the best scores in Accuracy, Macro Recall, Micro
Precision, Micro Recall, and Micro F1, as well as the best F<sub>latency</sub> among the top-ranked models.
It also achieved an ERDE30 of 0.284, better than the overall average (0.325) and comparable with
the best, PLN-PPM-ISB#0.</p></list-item>
        <list-item><p>UNSL#1 ranked 16th with a Macro F1 of 0.444, outperforming both the mean (0.426) and
the median (0.429) of all submissions, and providing an acceptable baseline for comparison.</p></list-item>
        <list-item><p>Among other proposals, I2C-UHU-Rigel#1 achieved the third-best Macro F1, VerbaNexAI-Lab#0
(27th) the best ERDE5, PLN-PPM-ISB#0 (17th) the best ERDE30 and the second-best F<sub>latency</sub>, while
Robertuito (20th) obtained the best Macro Precision and F<sub>latency</sub>.</p></list-item>
      </list>
      <p>In addition, the Organizers requested that the teams provide estimates of energy consumption and
resource usage to assess the computational and environmental impact using the CodeCarbon library<sup>2</sup>.
As shown in Table 3, our models were executed on the same hardware configuration, with each inference
taking an average of 2.3 seconds, consuming approximately 7E-05 kWh (kilowatt-hour) of energy, and
producing 1.66E-07 kgCO2eq (kilograms of carbon dioxide equivalent), all values significantly lower
than the recorded mean.</p>
      <sec id="sec-4-1">
        <title>4.1. Error analysis</title>
        <p><sup>2</sup> Available at: https://github.com/mlco2/codecarbon</p>
        <p>UNSL#0 detected the most true positives (TPs), while UNSL#2 identified the most true negatives (TNs). UNSL#2
also had the lowest number of false positives (FPs) and a similar number of false negatives (FNs),
achieving a balance between precision and recall, suggesting a more conservative approach to detecting
positive cases. In contrast, UNSL#0 reduced FNs but increased FPs, reflecting a more sensitive yet less
precise strategy that prioritizes early detection, even at the expense of generating more incorrect alerts.
Meanwhile, UNSL#1 exhibited intermediate performance in FNs, the highest number of FPs, and the
lowest TNs, indicating difficulties distinguishing between the two classes.</p>
        <p>Figure 2 shows the predictions of the three models when analyzing a positive user from Task 1.
Despite following different strategies, all three models consistently showed signs that the user was at
high risk throughout the analysis. UNSL#0 predicted the user as positive at round 7, UNSL#1 at round
39, and UNSL#2 at round 29. UNSL#0 exhibits fewer score variations, with confidence values remaining
close to 0.5, and issued an alert after exceeding the uncertainty interval defined by the decision policy.
UNSL#1 displays higher variability, with predictions oscillating between high and low probabilities
across many rounds. In contrast, UNSL#2 initially presents isolated high scores that later stabilize
toward the end of the analysis. Both UNSL#1 and UNSL#2 benefited from the history-based rule, which
allowed them to tolerate fluctuations and wait for consistent signals before issuing a final decision.</p>
        <p>The Venn diagram in Figure 3 illustrates the distribution of positive predictions across the three
models, highlighting their overlaps and divergences. The three models agreed on 57 positive instances
(35 correctly classified), while the remaining 22 were FPs shared by all three. Upon examining these
presumably incorrect samples, we observed recurring themes such as sports betting, video games with
elements of chance, and cryptocurrency trading. These cases often featured behaviors such as financial
speculation, active engagement in games, and intense emotional expressions related to wins and losses.
Only a minority consisted of short or ambiguous messages that required deeper contextual or linguistic
interpretation. This suggests that many of these FPs may be due to the limitations of the corpus in
clearly distinguishing between risk levels. The analysis highlights the intrinsic complexity of the task,
where ambiguity and overlap between users at high and low risk, not only at the lexical level (as noted
in Section 2) but also at the semantic level, can lead to misclassifications by both predictive models and human
evaluators.</p>
        <p>To assess the semantic consistency of the predictions made by the UNSL#2 and UNSL#0 models, we
show some illustrative examples using the following sentences:</p>
        <p>S1: “hoy jugué durante horas en BingX buscando ese rebote que me haría recuperar lo que
perdí ayer... pero me comí terrible divergencia!” (I played for hours today on BingX looking
for that rebound that would make up for what I lost yesterday... but I got caught with a
terrible divergence!).</p>
        <p>S2: “tarde o temprano llegará el gool, pero espero que no sea en el primer tiempo porque entre
con bastante” (Sooner or later the goal will come, but I hope it’s not in the first half because
I’m going in with a lot).</p>
        <p>S3: “he hecho algunas pequeñas inversiones, pero la verdad que no me gustan mucho estas
cosas” (I’ve made some small investments, but I’m not really into this kind of thing).</p>
        <p>The UNSL#2 model classified the first two sentences as positive and the third as negative. To assess
the semantic consistency of these predictions, we obtained the embeddings of each sentence using
the UNSL#2 encoder and calculated the cosine similarity between them. We observed that S1 and S2
exhibited high similarity (0.7282), while S3 showed low similarity with both S1 (0.1202) and S2 (0.0065).
This suggests that the model may be constructing a representation space where sentences related to
users with gambling tendencies (e.g., impulsive investments or sports betting) tend to be closer together,
while those without these characteristics are farther apart. Since distinguishing between users at high
and low risk can be ambiguous and challenging, even semantically consistent representations can result
in errors if the corpus does not adequately reflect these differences. However, the overall performance
of UNSL#2 in detecting the positive class depends on both the learned representation space and the
classifier that utilizes these representations for prediction.</p>
        <p>The UNSL#0 model, based on SS3, facilitates interpretation by providing a tool to visualize the
information it considers relevant for each prediction. For instance, Figure 4 shows that the model
classified sentence S1 as positive, assigning distinct relevance scores to terms such as durante (during),
BingX, rebote (bounce), and divergencia (divergence) for the positive class, while the term jugué (I
played) was slightly associated with the negative class. Additionally, the cumulative GV increases as the
sentence progresses, especially after encountering the word BingX, which strongly contributed to the
model’s decision.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Final Considerations</title>
        <p>The results highlight the inherent complexity of the task, particularly the challenge of distinguishing
between positive and negative classes. In [30], we explored Large Language Models (LLMs) to address
the ERD in depression by introducing a reasoning criterion grounded in expert knowledge. The goal
was not only to detect positive cases, but also to generate explanations that justify the model’s decisions.
While defining such criteria is difficult, this strategy can enhance data construction and interpretation,
allowing for more precise identification of the specific moments when risk signals emerge. It may also
contribute to more reliable evaluation metrics, such as an adaptive version of ERDE<sub>o</sub>, where each user
has an individual threshold o based on their behavior, penalizing delayed decisions more fairly.</p>
        <p>On the other hand, the ERDE metric was defined in [31] to penalize delayed TPs, while the cost
associated with FPs and FNs depends on the domain. The authors also point out that negative cases
correspond to non-risk situations, in which early or urgent intervention is not required. However, in
our case, negative users are already at some level of risk, which changes the notion of “early detection”:
it is no longer about identifying an early risk case under the assumption that risk is initially absent,
but rather about recognizing the moment when a user crosses a critical threshold that justifies more
serious concern. This shift in perspective may help explain why most participants underperformed on
early detection metrics, particularly ERDE<sub>o</sub>. Considering that in Task 1 the average number of posts
per user was around 60 (see Table 1), a higher threshold (e.g., o = 50) could have helped avoid severe
penalties during the initial stages of analysis, when much of the evidence was still unavailable.</p>
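        <p>The effect of the threshold o can be made concrete with a short Python sketch of the ERDE cost, following the commonly used formulation in which a true positive issued after k posts is charged a latency cost 1 − 1/(1 + exp(k − o)). The function names and the cost values c_fp, c_fn, and c_tp are placeholders chosen for illustration, not the values used by the Organizers; the rounds 7, 29, and 39 are the alert rounds discussed for the positive user in Figure 2.</p>
        <preformat>
```python
import math


def latency_cost(k, o):
    """Latency cost for a true positive issued after k posts with threshold o."""
    return 1.0 - 1.0 / (1.0 + math.exp(k - o))


def erde(decision, truth, k, o, c_fp, c_fn=1.0, c_tp=1.0):
    """ERDE cost of one user-level decision (1 = at risk, 0 = not at risk)."""
    if decision == 1 and truth == 0:
        return c_fp                       # false positive
    if decision == 0 and truth == 1:
        return c_fn                       # false negative
    if decision == 1 and truth == 1:
        return latency_cost(k, o) * c_tp  # true positive, penalized by delay
    return 0.0                            # true negative


# Latency cost of a correct alert at rounds 7, 29, and 39 under two thresholds.
costs_o30 = {k: latency_cost(k, o=30) for k in (7, 29, 39)}
costs_o50 = {k: latency_cost(k, o=50) for k in (7, 29, 39)}
print(costs_o30)
print(costs_o50)
```
        </preformat>
        <p>With o = 30, a correct alert at round 39 is charged almost the full true-positive cost, whereas with o = 50 the same alert is nearly free, which is the behavior the discussion above refers to.</p>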
        <p>This year’s edition raises relevant conceptual challenges for ERD, including how risk is defined,
how it is measured, and what types of decisions we expect models to make. For instance, an FP made
with very little evidence might represent a more serious false alarm than a late FP, in which the user
already shows ambiguous patterns, something even possibly acceptable from a preventive perspective.
The same holds for FNs: delaying a prediction may be reasonable when signals are weak, but as more
information accumulates over time, the model should be able to detect evident signs of risk.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Our laboratory addressed Task 1 of MentalRiskES 2025 by presenting three proposals based on the CPI+DMC
approach. Two of these proposals achieved the top two positions in the official results,
demonstrating that ERD can be addressed by balancing classification performance and decision-making speed
through a modular and independent approach. Corpus exploration played a crucial role in the selection
of the methods used. The results highlighted the complexity of distinguishing between high-risk and
low-risk users, which can be challenging even from a human perspective. It is essential to continue
researching strategies that enhance the quality and interpretation of data, particularly in the ERD of
mental health, where transparent and reliable systems are needed to support identification and analysis
in critical areas, such as gambling disorder. Furthermore, we will continue exploring new approaches
that address ERD by treating predictive effectiveness and speed as a single combined objective.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work is part of the doctoral research of Horacio Thompson, carried out at the Laboratorio de
Investigación y Desarrollo en Inteligencia Computacional (LIDIC), under the project PROICO 03-0620,
Argentina.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p>[23] J. M. Loyola, M. L. Errecalde, H. J. Escalante, M. Montes-y Gómez, Learning When to Classify for Early Text Classification, in: Computer Science–CACIC 2017: 23rd Argentine Congress, La Plata, Argentina, October 9-13, 2017, Revised Selected Papers 23, Springer, 2018, pp. 24–34.</p>
      <p>[24] H. Thompson, E. Villatoro-Tello, M. Montes-y Gómez, M. Errecalde, Temporal Fine-tuning for Early Risk Detection, in: Memorias de las JAIIO–Simposio Argentino de Inteligencia Artificial y Ciencias de Datos (ASAID), volume 10, 2024, pp. 137–149.</p>
      <p>[25] P. Álvarez-Ojeda, M. V. Cantero-Romero, A. Semikozova, A. Montejo-Ráez, The PRECOM-SM Corpus: Gambling in Spanish Social Media, in: Proceedings of the 31st International Conference on Computational Linguistics, 2025, pp. 17–28.</p>
      <p>[26] S. G. Burdisso, M. Errecalde, M. Montes-y Gómez, A Text Classification Framework for Simple and Effective Early Depression Detection Over Social Media Streams, Expert Systems with Applications 133 (2019) 182–197.</p>
      <p>[27] J. Cañete, G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, J. Pérez, Spanish Pre-trained BERT Model and Evaluation Data, in: PML4DC at ICLR 2020, 2020.</p>
      <p>[28] N. Reimers, I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, 2019. arXiv:1908.10084.</p>
      <p>[29] L. Tunstall, N. Reimers, U. E. S. Jo, L. Bates, D. Korat, M. Wasserblat, O. Pereg, Efficient Few-Shot Learning Without Prompts, 2022. arXiv:2209.11055.</p>
      <p>[30] H. Thompson, M. Sapino, E. Ferretti, M. Errecalde, Hacia la Interpretabilidad de la Detección Anticipada de Riesgos de Depresión Utilizando Grandes Modelos de Lenguaje, 2025. arXiv:2503.20939.</p>
      <p>[31] D. E. Losada, F. Crestani, A Test Collection for Research on Depression and Language Use, in: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, 2016, pp. 28–39.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Hodgins</surname>
          </string-name>
          ,
          <article-title>A Review of the Evidence for Considering Gambling Disorder (and Other Behavioral Addictions) as a Disorder Due to Addictive Behaviors in the ICD-11: a Focus on Case-Control Studies</article-title>
          ,
          <source>Current Addiction Reports</source>
          <volume>6</volume>
          (
          <year>2019</year>
          )
          <fpage>273</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Wöhr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wuketich</surname>
          </string-name>
          ,
          <article-title>Perception of Gamblers: A Systematic Review</article-title>
          ,
          <source>Journal of Gambling Studies</source>
          <volume>37</volume>
          (
          <year>2021</year>
          )
          <fpage>795</fpage>
          -
          <lpage>816</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Newall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Begg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rockloff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hing</surname>
          </string-name>
          ,
          <article-title>A Framework for Indirect Elicitation of the Public Health Impact of Gambling Problems</article-title>
          ,
          <source>BMC Public Health</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Bargeron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hormes</surname>
          </string-name>
          ,
          <article-title>Psychosocial Correlates of Internet Gaming Disorder: Psychopathology, Life Satisfaction, and Impulsivity</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>68</volume>
          (
          <year>2017</year>
          )
          <fpage>388</fpage>
          -
          <lpage>394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>De Pasquale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sciacca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Martinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chiappedi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dinaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hichy</surname>
          </string-name>
          ,
          <article-title>Relationship of Internet Gaming Disorder with Psychopathology and Social Adaptation in Italian Young Adults</article-title>
          ,
          <source>International Journal of Environmental Research and Public Health</source>
          <volume>17</volume>
          (
          <year>2020</year>
          )
          <fpage>8201</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-K.</given-names>
            <surname>Tong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Lau</surname>
          </string-name>
          ,
          <article-title>Prevalence and Associated Factors of Internet Gaming Disorder Among Community Dwelling Adults in Macao, China</article-title>
          ,
          <source>Journal of Behavioral Addictions</source>
          <volume>7</volume>
          (
          <year>2018</year>
          )
          <fpage>62</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Langham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Greer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rockloff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Donaldson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          , et al.,
          <article-title>Assessing Gambling-Related Harm in Victoria: A Public Health Perspective</article-title>
          ,
          <source>Technical Report, Victorian Responsible Gambling Foundation</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Núñez-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Burgos-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Mínguez-Mínguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Menéndez-Vega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Antoñanzas-Laborda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>González-Bernal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>González-Santos</surname>
          </string-name>
          ,
          <article-title>Effectiveness of Therapeutic Interventions in the Treatment of Internet Gaming Disorder: A Systematic Review</article-title>
          ,
          <source>European Journal of Investigation in Health, Psychology and Education</source>
          <volume>15</volume>
          (
          <year>2025</year>
          )
          <fpage>49</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Mestre-Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Paiva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S. M.</given-names>
            <surname>Iniguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Beranuy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martín-Vivar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mallorquí-Bagué</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Normand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Chicote</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Potenza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Arrondo</surname>
          </string-name>
          ,
          <article-title>The Association Between Internet-Use-Disorder Symptoms and Loneliness: A Systematic Review and Meta-Analysis with a Categorical Approach</article-title>
          ,
          <source>Psychological Medicine</source>
          <volume>55</volume>
          (
          <year>2025</year>
          )
          <fpage>e77</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabellini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lucchini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Gattoni</surname>
          </string-name>
          ,
          <article-title>Prevalence of Problem Gambling: A Meta-analysis of Recent Empirical Research (2016-2022)</article-title>
          ,
          <source>Journal of Gambling Studies</source>
          <volume>39</volume>
          (
          <year>2023</year>
          )
          <fpage>1027</fpage>
          -
          <lpage>1057</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Allami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Hodgins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Brunelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Currie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dufour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>Flores-Pajot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Nadeau</surname>
          </string-name>
          ,
          <article-title>A Meta-Analysis of Problem Gambling Risk Factors in the General Adult Population</article-title>
          ,
          <source>Addiction</source>
          <volume>116</volume>
          (
          <year>2021</year>
          )
          <fpage>2968</fpage>
          -
          <lpage>2977</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Rash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weinstock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Van Patten</surname>
          </string-name>
          ,
          <article-title>A Review of Gambling Disorder and Substance Use Disorders</article-title>
          ,
          <source>Substance Abuse and Rehabilitation</source>
          (
          <year>2016</year>
          )
          <fpage>3</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mármol-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza-del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Molina-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Montejo-Ráez</surname>
          </string-name>
          ,
          <article-title>Overview of MentalRiskES at IberLEF 2023: Early Detection of Mental Disorders Risk in Spanish</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>71</volume>
          (
          <year>2023</year>
          )
          <fpage>329</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mármol-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza-del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Molina-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Montejo-Ráez</surname>
          </string-name>
          ,
          <article-title>Overview of MentalRiskES at IberLEF 2024: Early Detection of Mental Disorders Risk in Spanish</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>73</volume>
          (
          <year>2024</year>
          )
          <fpage>435</fpage>
          -
          <lpage>448</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Parapar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Martín-Rodilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Losada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <article-title>Overview of eRisk 2024: Early Risk Prediction on the Internet</article-title>
          , in:
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. 15th International Conference of the CLEF Association, CLEF 2024</source>
          , Grenoble, France, Springer International,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>Early Detection of Depression and Eating Disorders in Spanish: UNSL at MentalRiskES 2023</article-title>
          ,
          <year>2023</year>
          . arXiv:
          <pub-id pub-id-type="arxiv">2310.20003</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Loyola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Burdisso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Cagnina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>UNSL at eRisk 2021: A Comparison of Three Early Alert Policies for Early Risk Detection</article-title>
          , in:
          <source>CLEF (Working Notes)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>992</fpage>
          -
          <lpage>1021</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Loyola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Burdisso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>UNSL at eRisk 2022: Decision Policies with History for Early Classification</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagnina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>Strategies to Harness the Transformers' Potential: UNSL at eRisk 2023</article-title>
          , in: CLEF (Working Notes),
          <year>2023</year>
          , pp.
          <fpage>791</fpage>
          -
          <lpage>804</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Errecalde</surname>
          </string-name>
          ,
          <article-title>A Time-Aware Approach to Early Detection of Anorexia: UNSL at eRisk 2024</article-title>
          ,
          <year>2024</year>
          . arXiv:
          <pub-id pub-id-type="arxiv">2410.17963</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mármol-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Álvarez-Ojeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. P.</given-names>
            <surname>del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Molina-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Montejo-Ráez</surname>
          </string-name>
          ,
          <article-title>Overview of MentalRiskES at IberLEF 2025: Early Detection of Mental Disorders Risk in Spanish</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>75</volume>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J. Á.</given-names>
            <surname>González-Barba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <article-title>Overview of IberLEF 2025: Natural Language Processing Challenges for Spanish and other Iberian Languages</article-title>
          , in:
          <source>Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2025), co-located with the 41st Conference of the Spanish Society for Natural Language Processing (SEPLN 2025)</source>
          , CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>