<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>HIT-SCIR@eRisk2025: Exploring the Potential of a Learnable Screening Model and Risk Post Buffer-Based Framework for Contextualized Early Prediction of Depression on Social Media</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuzhe Zi</string-name>
          <email>yuzhezi@ir.hit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bichen Wang</string-name>
          <email>bichenwang@ir.hit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yanyan Zhao</string-name>
          <email>yyzhao@ir.hit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bing Qin</string-name>
          <email>qinb@ir.hit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CLEF 2025 Working Notes</institution>
          ,
          <addr-line>9 - 12</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Research Center for Social Computing and Interactive Robotics, Harbin Institute of Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Task 2 of the eRisk lab at CLEF 2025 focuses on the contextualized early detection of depression using user posts from Reddit. The HIT-SCIR team participated in this task, submitting five runs based on different configurations of our proposed Learnable Screening Model and Risk Post Buffer-Based Framework. Our approach involves several key components: contextual data augmentation using Large Language Models (LLMs) to simulate social interactions and generate summaries for training data; a core end-to-end learnable risky post screening model guided by symptom descriptions from established psychiatric scales; and a depression risk detector utilizing MentalBERT for classification. The official results on the test data demonstrate that our framework ranked first across several evaluation metrics, notably F1-score, ERDE_50, F_latency, and various ranking-based measures. This note describes the architecture, experimental setup, and performance analysis of our system, highlighting the value of integrating psychiatric knowledge into a learnable, context-aware model.</p>
      </abstract>
      <kwd-group>
        <kwd>Early Depression Detection</kwd>
        <kwd>Social Media</kwd>
        <kwd>Psychiatric Scale</kwd>
        <kwd>Contextualized Detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        According to the World Health Organization (WHO), depression affects approximately 3.8% of the
global population. In the United States, nearly 15% of adults experience at least one major depressive
episode during their lifetime [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Early risk prediction represents an emerging research area with broad
applications, such as identifying individuals at risk of mental disorders—a prominent societal concern.
In depression detection, Early Risk Detection (ERD) is particularly crucial due to its predictive timeliness,
as early warnings facilitate more timely intervention windows.
      </p>
      <p>
        With the proliferation of the internet, social media platforms have become conventional avenues
for individuals to openly express their thoughts and emotions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Data from these platforms offer
abundant resources for sentiment analysis and mental health inference [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Globally, considerable
research focuses on leveraging social media for depression detection to mitigate the severe consequences
of this condition [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. The CLEF eRisk 2025 Task 2 [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ] introduces a novel scenario for depression
detection by incorporating complete conversational contexts. Unlike previous eRisk editions, which
released only isolated posts from individual users, this year’s task provides entire Reddit discussion
threads involving the target user. This allows participating systems to access not only the target user’s
posts but also all interactions and their relational structures within the discussion threads. The task aims
to simulate real-world scenarios where identifying depression necessitates the analysis of multi-party
conversations. This edition features a dataset constructed from user posts on the Reddit platform. Our
team, HIT-SCIR, participated in this task and achieved strong performance.
      </p>
      <p>
        This study explores the effectiveness of a technique that combines an online detection algorithm
based on a dynamic queue of at-risk posts [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] with an end-to-end learnable risk post screening scheme
guided by psychological questionnaires, for the eRisk 2025 Task 2. Addressing the specifics of Task 2,
which does not provide training data with user interactions, we employ LLMs to augment posts with
potential contextual information to construct relevant training data.
      </p>
      <p>
        Inspired by the implementation of Zhang et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], our architecture comprises a risky post screener
and a depression risk detector. During training, the screener calculates a risk score for each post based
on its cosine similarity with descriptions from psychological scales. Subsequently, posts exhibiting
higher risk scores are filtered for depression risk detection. The detector employs a Hierarchical
Attention Network (HAN), utilizing BERT to acquire embedding representations for individual posts.
It then models inter-post interactions using a Transformer and attention mechanisms, ultimately
generating user features. Given the task’s strong dependence on psychological knowledge, we leverage
MentalBERT[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], pre-trained on psychology-related data, to generate post embeddings. Furthermore, the
Straight-Through Estimator (STE) technique is employed to jointly train the screener and the detector.
For final early detection testing, we utilize a dynamic risky post queue in conjunction with different
alerting strategies for early depression detection. Through five-fold cross-validation, we evaluate the
model’s performance under various parameter settings. The top three performing models are selected
for ensemble voting, and different early decision strategies are configured across five submission runs.
The results indicate that our method outperforms other participating systems on most evaluation
metrics.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Proposed Frameworks</title>
      <p>The remainder of this paper is structured as follows: Section 2 details the technical framework, Section
3 reports and analyzes the experimental results, and Section 4 concludes the study and discusses future
directions.</p>
      <p>Our overall process begins with a Contextual Data Augmentation phase. The subsequent
framework, as illustrated in Figure 1, comprises two core stages: Psychiatric Scale-guided Risky Post
Screening and Dynamic User-Level Early Risk Assessment Strategy.</p>
      <sec id="sec-2-1">
        <title>2.1. Contextual Data Augmentation</title>
        <p>The training data provided in eRisk2025 Task 2 consists of isolated user posts that inherently lack
interactive context (e.g., comments or replies). However, posts encountered in test scenarios typically include
associated contextual information, which proves crucial for accurate understanding and prediction.
To bridge this discrepancy between the context-scarce training data and potentially context-rich test
environments, and to enable our model to effectively leverage contextual information during testing, we
employ LLMs to generate simulated contextual information for the original user posts in the training
data. This stage involves two specific steps:</p>
        <p>[Figure 1: Overview of the proposed framework. Posts and their comments enter the Psychiatric Scale-Guided Screening Module, whose screening model is guided by depression scale items (e.g., "I feel depressed.", "I always cry.", "I am treating my depression.", "I am too tired to do things."). Comment summarization feeds a dynamic memory buffer with buffer scores, which drives the final risk prediction (depressed / non-depressed).]</p>
        <p>
          Generation of Simulated Social Interactions: For each original post p_i in the training set, a
pretrained generative LLM, denoted as LLM_gen (we use the Phi-4 model [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] in this study), is utilized
to generate a set of simulated comments {c_{i,1}, c_{i,2}, . . . , c_{i,K}}. Here, K is a pre-defined number of
generated comments. This process is guided by a specific instruction prompt, Prompt_comment, which
directs the model to generate diverse and relevant comments based on the post content, outputting
them in a structured format (e.g., JSON). This process can be represented as:
        </p>
        <p>{c_{i,j}}_{j=1}^{K} = LLM_gen(p_i, Prompt_comment) (1)
This step aims to supplement the original post with potential social reactions and discussion points,
thereby providing a richer semantic context.</p>
        <p>Summarization of Comment Content: Considering that multiple generated comments might
contain redundant information or introduce unnecessary noise, we further summarize the comment
set {c_{i,j}} generated in the previous step to extract its core semantics. This step also employs an LLM,
LLM_sum (in this study, we use the Phi-4 model), guided by a specific summarization instruction prompt,
Prompt_summary. The generated summary is denoted as s_i:</p>
        <p>s_i = LLM_sum({c_{i,j}}_{j=1}^{K}, Prompt_summary) (2)
Through this contextual augmentation process, each original post p_i is equipped with a highly condensed
comment summary s_i generated by the LLM. Both p_i and s_i serve as inputs to the subsequent risk
prediction model.</p>
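        <p>The two augmentation steps above can be sketched as follows. This is a minimal, illustrative pipeline: call_llm is a deterministic stand-in for the Phi-4 calls, and the prompt wording is a plausible example rather than the authors' exact prompts.</p>

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for LLM_gen / LLM_sum (the paper uses Phi-4). Returns canned
    output so the pipeline structure can be exercised offline."""
    if "JSON list" in prompt:
        return json.dumps(["Sorry you are going through this.",
                           "Have you been able to talk to anyone about it?"])
    return "Supportive replies encouraging the poster to seek help."

def generate_comments(post: str, k: int = 2) -> list[str]:
    # Step 1 (Eq. 1): simulate K social reactions to the post, in JSON format.
    prompt = (f"Write {k} diverse, relevant comments reacting to this post. "
              f"Output them as a JSON list of strings.\nPost: {post}")
    return json.loads(call_llm(prompt))

def summarize_comments(comments: list[str]) -> str:
    # Step 2 (Eq. 2): condense the comment set into one summary s_i.
    prompt = ("Summarize the core semantics of these comments:\n"
              + "\n".join(comments))
    return call_llm(prompt)

comments = generate_comments("I can't sleep and nothing feels worth doing.")
summary = summarize_comments(comments)
```

        <p>Parsing the first step's output as JSON before summarization mirrors the structured-format requirement of Prompt_comment, so malformed generations fail loudly instead of polluting the training data.</p>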
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Psychiatric Scale-guided Risky Post Screening</title>
        <p>We focus on how to extract key depression-related information from users’ post histories. Our approach
involves using psychological scales to screen for useful posts and then detect depression. To prevent
cascading errors, we have implemented an end-to-end joint training approach for both the screening
and detection stages. This model facilitates a seamless flow between identifying relevant posts and
ultimately detecting depression.</p>
        <sec id="sec-2-2-1">
          <title>2.2.1. Input Feature Representation</title>
          <p>
            The model receives the user’s original post p_i and its corresponding augmented context (comment
summary s_i) as input. We employ a pre-trained text encoder, Encoder_FE, which is MentalBERT [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ], a
BERT variant optimized for mental health texts, to independently convert these two text segments
into fixed-dimensional dense vector representations:
e_p = Encoder_FE(p_i) (3)
e_s = Encoder_FE(s_i) (4)
Subsequently, these two embedding vectors are concatenated to form the comprehensive feature
representation for the post, x_i = [e_p ; e_s].
          </p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Psychiatric Scale-Guided Screening Module</title>
          <p>This module aims to assess the importance of each post based on psychiatric scales and to screen for
representative posts.</p>
          <p>
            Construction of Symptom Templates and Embeddings: For the screening process, we first define a
scale item template set T = {t_1, t_2, . . . , t_M}. These entries t_j (symptom templates) are derived from
widely used and validated psychological depression assessment scales [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ], and are supplemented by
three direct descriptive sentences for depression. All descriptions are detailed in Appendix A. These
templates represent clinically recognized criteria and manifestations of depression as defined by these scales.
Each scale item template t_j is encoded into an embedding vector e_{t_j} using the text embedding model
Encoder_E5, specifically multilingual-e5-large-instruct [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ]. To enhance semantic alignment, we prepend
a task-specific prompt to these scale item descriptions before they are input to
Encoder_E5.
          </p>
          <p>The prompt used is: “Given a description of depression symptoms, retrieve the posts from
user submissions that exhibit those symptoms.”</p>
          <p>This process for obtaining the scale item template embeddings E = {e_{t_1}, . . . , e_{t_M}} is represented for
each template as:
e_{t_j} = Encoder_E5(Prompt + t_j) (5)
where ‘Prompt‘ denotes the aforementioned task instruction, and + signifies concatenation.</p>
          <p>Post Risk Score Calculation: For each user post p_i, its textual content is directly encoded using
the same Encoder_E5 to obtain the post embedding e′_{p_i}. Crucially, no prompt is prepended to the post
content for this embedding step:
e′_{p_i} = Encoder_E5(p_i) (6)
Then, we calculate the cosine similarity between this post embedding e′_{p_i} (from Eq. 6) and all scale
item embeddings e_{t_j} (from Eq. 5) in the knowledge base. The maximum similarity value is taken as the
importance score r_i of post p_i:
r_i = max_{1 ≤ j ≤ M} ( e′_{p_i} · e_{t_j} / (‖e′_{p_i}‖ ‖e_{t_j}‖) ) (7)
The collection of importance scores for all posts is denoted as r = {r_1, r_2, . . . , r_N}.</p>
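          <p>The max-over-templates cosine score can be sketched in a few lines of NumPy. The toy 3-dimensional vectors below are illustrative stand-ins for the actual Encoder_E5 outputs:</p>

```python
import numpy as np

def post_risk_score(post_emb: np.ndarray, template_embs: np.ndarray) -> float:
    """Importance score r_i: maximum cosine similarity between a post
    embedding and all symptom-template embeddings."""
    post = post_emb / np.linalg.norm(post_emb)
    templates = template_embs / np.linalg.norm(template_embs, axis=1, keepdims=True)
    return float(np.max(templates @ post))   # best-matching scale item wins

# Toy embeddings standing in for Encoder_E5 outputs (illustrative values only).
templates = np.array([[1.0, 0.0, 0.0],   # e.g. "I feel depressed."
                      [0.0, 1.0, 0.0]])  # e.g. "I am too tired to do things."
post = np.array([0.9, 0.1, 0.0])
score = post_risk_score(post, templates)  # close to 1: post matches template 1
```

          <p>Taking the maximum rather than the mean means a single strongly matching scale item is enough to flag a post, which suits screening for sparse symptom mentions.</p>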
          <p>Differentiable Dynamic Post Screening Mask Generation: Our goal is to select the top k′ posts
with the highest importance scores (where k′ can be a proportion or a fixed number). Traditional
Top-k′ operations are non-differentiable. To enable end-to-end learning, we employ a Straight-Through
Estimator (STE) method that allows for gradient propagation. First, based on the sequence of importance
scores [r_1, . . . , r_N], a threshold τ is calculated (e.g., determined via quantile to identify the cut-off for
the top k′ scores). A preliminary hard selection mask m′ is generated:
m′ = I{r ≥ τ} (8)
where I is the indicator function. To achieve differentiability, we construct the final mask
m as follows:
m = r + m′ − detach(r) (9)
where the detach(·) operation prevents gradients from flowing back through its argument. This
construction ensures that during the forward pass, the mask m behaves exactly like the hard
selection m′, while during the backward pass, gradients can flow to the importance score r calculation,
allowing parameters of the screening process to be optimized. This results in the mask sequence
m = [m_1, m_2, . . . , m_N].</p>
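          <p>A minimal PyTorch sketch of this STE mask, assuming a fixed top-k selection rather than a quantile threshold: the forward pass yields the hard 0/1 mask, while the backward pass routes gradients to the scores r.</p>

```python
import torch

def ste_topk_mask(r: torch.Tensor, k: int) -> torch.Tensor:
    """Straight-Through Estimator mask: forward pass equals the hard
    top-k selection m', backward pass flows gradients into r."""
    tau = torch.topk(r, k).values.min()   # threshold: the k-th largest score
    m_hard = (r >= tau).float()           # m' = I{r >= tau}, non-differentiable
    return r + m_hard - r.detach()        # m = r + m' - detach(r)

r = torch.tensor([0.2, 0.9, 0.4, 0.7], requires_grad=True)
m = ste_topk_mask(r, k=2)                 # forward value: approx [0., 1., 0., 1.]
m.sum().backward()                        # gradients reach r despite the hard mask
```

          <p>Because the detached term cancels r numerically, the forward value equals the hard mask, yet autograd still sees m as r plus a constant, so every score receives a gradient.</p>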
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Transformer Encoding and Final Classification</title>
          <p>The mask mentioned above helps us filter out irrelevant information. We will now integrate this mask
into subsequent operations in a diferentiable form.</p>
          <p>Mask-Guided Sequence Encoding: The user’s sequence of comprehensive post features
[x_1, . . . , x_N] and its corresponding differentiable dynamic feature selection mask sequence [m_1, . . . , m_N]
are fed into one or more Transformer encoder layers (TransformerEncoderLayer). Within the
self-attention mechanism, we modify the masking mechanism for all attention layers as follows:
S = QK^T / √d,  α_{i,j} = m_j exp(S_{i,j}) / Σ_{l=1}^{N} m_l exp(S_{i,l}) (10)
In this modification, m_j represents the mask value for the j-th post in the sequence being
attended to; the self-attention mechanism thus ensures that during training, posts where m_j = 0
do not participate in subsequent computations, and
only posts with m_j = 1 are considered. The detection model requires access to all posts during training
to ensure updates to the screening process. However, during inference, we can discard the posts where
m_j = 0, which does not increase the model’s inference time. The output of the Transformer layers is:
[h_1, h_2, . . . , h_N] = TransformerEncoderLayer([x_1, . . . , x_N], attention_mask = [m_1, . . . , m_N]) (11)
Here, h_i is the context-aware representation of post p_i after Transformer encoding.</p>
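          <p>The mask-weighted softmax can be illustrated in NumPy; this sketch reproduces only the modified attention weighting, not the full Transformer layer:</p>

```python
import numpy as np

def masked_attention_weights(S: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Attention weights with the screening mask folded into the softmax:
    screened-out posts (mask 0) receive exactly zero attention weight."""
    w = mask * np.exp(S)                       # m_j * exp(S_ij)
    return w / w.sum(axis=-1, keepdims=True)   # renormalize over kept posts

S = np.array([[0.5, 1.0, 0.2]])   # one query row attending over three posts
m = np.array([1.0, 1.0, 0.0])     # third post was screened out (mask 0)
alpha = masked_attention_weights(S, m)
```

          <p>Multiplying inside the softmax (rather than adding a large negative bias) keeps the mask values themselves on the gradient path, which is what lets the screener receive updates from the detector's loss.</p>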
          <p>User-Level Representation Aggregation: To obtain a single vector representing the user’s overall
state, we aggregate the output sequence from the Transformer encoder [h_1, . . . , h_N]; h_user is obtained
through average pooling.</p>
          <p>Risk Prediction: Finally, the aggregated user-level representation h_user is fed into a multi-layer
perceptron (MLP) classifier, which outputs the probability ŷ of the user having depression risk:
ŷ = σ(MLP(h_user)) (12)
where σ is the Sigmoid activation function. The entire model is trained end-to-end by minimizing the
binary cross-entropy loss (BCEWithLogitsLoss) between the predicted probabilities and the true labels.
We jointly optimize the filtering of important posts and the final classification.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Dynamic User-Level Early Risk Assessment Strategy</title>
        <p>To more authentically simulate the application scenarios of early risk detection, we adopt the following
dynamic assessment process:</p>
        <p>Dynamic Memory Buffer: For each user, we maintain a dynamic memory buffer of fixed capacity.
This buffer stores the posts from the user’s recent history that are considered most relevant to
depressive manifestations, based on their symptom relevance scores r_i. When a new post is published
by the user, its r_i value is compared with those of the posts currently in the buffer. If the new post’s
relevance is higher and the buffer is full, the post with the lowest relevance is removed, and the new
post is added. This ensures that the model always makes judgments based on the user’s most relevant
recent posts.</p>
        <p>Sequential Prediction and Alert Mechanism: User posts are treated as a time series. At each time
step (i.e., after a user publishes a new post and the memory buffer is potentially updated), the model
(Section 2.2) performs a depression risk prediction based on the current set of posts in the memory buffer.
If the model’s predicted probability exceeds a pre-set decision threshold for n consecutive
times (e.g., n = 2), the system identifies the user as being at risk of depression and triggers an alert.</p>
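        <p>Both mechanisms can be sketched in a few lines of Python. The class and function names are illustrative; the buffer is implemented here with a min-heap so the weakest kept post is always cheap to evict.</p>

```python
import heapq

class RiskPostBuffer:
    """Dynamic memory buffer: keeps the posts with the highest symptom
    relevance scores; the min-heap root is always the weakest kept post."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._heap: list[tuple[float, str]] = []   # (score, post) pairs

    def add(self, score: float, post: str) -> None:
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, (score, post))
        elif score > self._heap[0][0]:             # beats the weakest kept post
            heapq.heapreplace(self._heap, (score, post))

    def posts(self) -> list[str]:
        return [p for _, p in self._heap]

def should_alert(probs: list[float], threshold: float, n_consecutive: int) -> bool:
    """Trigger an alert once the predicted risk probability exceeds the
    decision threshold for n consecutive predictions."""
    streak = 0
    for p in probs:
        streak = streak + 1 if p > threshold else 0
        if streak >= n_consecutive:
            return True
    return False

buf = RiskPostBuffer(capacity=2)
for score, post in [(0.1, "post A"), (0.5, "post B"), (0.3, "post C")]:
    buf.add(score, post)
# buffer now holds the two most relevant posts, B and C
```

        <p>Requiring a streak of consecutive above-threshold predictions, rather than a single one, trades a small delay for robustness against one-off spikes in the predicted risk.</p>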
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental Evaluation</title>
      <sec id="sec-3-1">
        <title>3.1. Datasets</title>
        <p>The organizers provide task-specific corpora constructed from user posts on Reddit during designated
time periods. The data is distributed in XML format, containing user IDs, timestamps, post titles, and
textual content. For Task 2, the training corpus features binary classification labels distinguishing
between depression cases and control groups. Notably, all classifiers for Task 2 were trained exclusively
on this dataset without incorporating any external data sources. This approach ensures the evaluation
reflects the models’ performance under controlled conditions.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Experimental Setup</title>
        <p>
          We submit the results of five runs, each based on a distinct decision-making approach for early detection.
The core of all approaches is a risk prediction model implemented as a voting ensemble. This ensemble
was constructed by selecting the three best-performing individual detection models from a candidate
pool, a selection process guided by five-fold cross-validation on the training corpus. The five
submitted runs (i.e., decision-making approaches) utilize this same pre-selected voting ensemble but
differ in their operational parameters, the decision threshold θ and consecutive-alert count n, as follows:
1. Run 0: θ = 0.5, n = 1
2. Run 1: θ = 0.6, n = 1
3. Run 2: θ = 0.7, n = 1
4. Run 3: θ = 0.5, n = 2
5. Run 4: θ = 0.5, n = 3
The performance of the proposed frameworks on the training set is quantitatively assessed using
precision (P), recall (R), and F1-score (F1) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Additionally, the organizers evaluate the run
results across multiple dimensions, including ERDE_5 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], ERDE_50 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], latency true positive rate
(latency_TP) [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], speed metric (speed) [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], and latency-weighted F1-score (F_latency) [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Analysis of Results</title>
        <p>Table 1 presents the performance of the five HIT-SCIR test runs on decision metrics (Precision P / Recall
R / F1-score) and temporal metrics (ERDE_5 / ERDE_50 / latency_TP / speed / F_latency). Among all
participating teams, HIT-SCIR-4 achieves the first rank in F1-score (0.85) and ERDE_50 (0.03), and
second rank in ERDE_5 (0.06). Furthermore, HIT-SCIR-4 and HIT-SCIR-2 are tied for first place in the
F_latency metric (both 0.82). Regarding other metrics within the team: HIT-SCIR-0 exhibits the highest
Recall (0.96). HIT-SCIR-0, HIT-SCIR-1, and HIT-SCIR-2 perform identically and better than the other
two submissions on latency_TP (4.00) and speed (0.99); these three runs also outperform HIT-SCIR-3
(0.08) and HIT-SCIR-4 (0.09) on the ERDE_5 metric (0.06).</p>
        <p>Ranking-based evaluation employs standard information retrieval metrics such as Precision at 10
(P@10) and Normalized Discounted Cumulative Gain (NDCG) to assess user risk levels sorted in
descending order. Table 2 shows that the HIT-SCIR team’s runs achieve first place in the vast majority
of ranking metrics. Specifically: First, in the "1 writing" evaluation, all team runs rank first in P@10
(1.00) and NDCG@10 (1.00); however, the team does not achieve the top rank for the NDCG@100
metric (0.58). Second, when evaluating with one hundred posts ("100 writings"), all team runs rank first
in P@10 (1.00) and NDCG@10 (1.00). For the NDCG@100 metric, the results of HIT-SCIR-0,
HIT-SCIR-1, HIT-SCIR-2, and HIT-SCIR-3 (0.84) rank first, while HIT-SCIR-4 (0.83) performs slightly lower
and does not achieve first place in this specific instance. Third, when evaluating with five hundred posts
("500 writings"), all team runs achieve first place in all three metrics: P@10 (1.00), NDCG@10 (1.00),
and NDCG@100 (0.89). Finally, when evaluating with one thousand posts ("1000 writings"), all team
runs also achieve first place in all three metrics: P@10 (1.00), NDCG@10 (1.00), and NDCG@100
(0.90).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>The eRisk 2025 Task 2 underscores the complexities of early, contextualized depression detection from
social media data. Our HIT-SCIR team proposes and evaluates a multi-stage framework centered around
a learnable, psychiatric scale-guided screening model. This model is augmented by Large Language
Models (LLMs), which are employed for two key purposes: first, to perform contextual augmentation on
the training data by simulating interactions, and second, to generate concise summaries of contextual
information (e.g., comment threads) for both training and testing data. This summarization process
offers the advantage of distilling relevant contextual signals while filtering out irrelevant information,
thereby facilitating more focused and efficient processing by subsequent model components. Empirical
analysis from our five submitted runs reveals the significant benefits of this integrated approach. The
direct incorporation of knowledge derived from psychiatric scales into a differentiable screening
mechanism, combined with the LLM-refined contextual information and the specialized representations from
MentalBERT, allows our models to effectively identify and prioritize risk-indicative posts. This results
in leading performance on key metrics such as F1-score, ERDE_50, and F_latency. While traditional
methods might struggle with the nuanced and evolving nature of online discourse, our end-to-end
learnable system demonstrates robust adaptability. Future work will focus on refining the contextual
augmentation techniques, exploring more sophisticated modeling of multi-party conversational
dynamics within threads, and further enhancing the screening module’s sensitivity to subtle or emerging
signs of depression. We also plan to investigate the framework’s generalizability to other mental health
conditions.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Gemini 2.5 Pro for grammar and spelling checks.
After using this tool, the authors carefully reviewed and edited the content as needed and take full responsibility
for the publication’s content.</p>
    </sec>
    <sec id="sec-6">
      <title>A. Depression Templates</title>
      <p>
        Here we provide the detailed templates in Table 3. Following prior work [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we employ the same
combination of 3 direct depression descriptions and the 21 indirect symptoms derived from the Beck
Depression Inventory-II (BDI-II) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Kessler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Berglund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Demler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Merikangas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Walters</surname>
          </string-name>
          ,
          <article-title>Lifetime prevalence and age-of-onset distributions of dsm-iv disorders in the national comorbidity survey replication</article-title>
          ,
          <source>Archives of general psychiatry 62</source>
          (
          <year>2005</year>
          )
          <fpage>593</fpage>
          -
          <lpage>602</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Choudhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gamon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Counts</surname>
          </string-name>
          , E. Horvitz, Predicting Depression via Social Media,
          <source>Proceedings of the International AAAI Conference on Web and Social Media</source>
          <volume>7</volume>
          (
          <year>2013</year>
          )
          <fpage>128</fpage>
          -
          <lpage>137</lpage>
          . doi:10.1609/icwsm.v7i1.14432.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] M. De Choudhury, S. Counts, E. Horvitz,
          <article-title>Social media as a measurement tool of depression in populations</article-title>
          ,
          <source>in: Proceedings of the 5th Annual ACM Web Science Conference</source>
          , WebSci '13, Association for Computing Machinery, New York, NY, USA,
          <year>2013</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>56</lpage>
          . doi:10.1145/2464464.2464480.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] L. Zhou, D. Zhang, C. Yang, Y. Wang,
          <article-title>Harnessing social media for health information management</article-title>
          ,
          <source>Electronic Commerce Research and Applications</source>
          <volume>27</volume>
          (
          <year>2018</year>
          )
          <fpage>139</fpage>
          -
          <lpage>151</lpage>
          . doi:10.1016/j.elerap.2017.12.003.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>A.</given-names> <surname>Malhotra</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Jindal</surname></string-name>,
          <article-title>Deep learning techniques for suicide and depression detection from online social media: A scoping review</article-title>,
          <source>Applied Soft Computing</source>
          <volume>130</volume>
          (<year>2022</year>)
          <fpage>109713</fpage>.
          doi:10.1016/j.asoc.2022.109713.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>J.</given-names> <surname>Parapar</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Perez</surname></string-name>,
          <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>,
          <article-title>Overview of eRisk 2025: Early risk prediction on the internet</article-title>,
          in:
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction - 16th International Conference of the CLEF Association, CLEF 2025, Madrid, Spain, September 9-12, 2025, Proceedings, Part II</source>,
          volume To be published of Lecture Notes in Computer Science, Springer,
          <year>2025</year>.
        </mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>J.</given-names> <surname>Parapar</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Perez</surname></string-name>,
          <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>,
          <article-title>Overview of eRisk 2025: Early risk prediction on the internet (extended overview)</article-title>,
          in:
          <source>Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2025), Madrid, Spain, 9-12 September 2025</source>,
          volume To be published of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2025</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>Z.</given-names> <surname>Zhang</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Chen</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Wu</surname></string-name>,
          <string-name><given-names>K. Q.</given-names> <surname>Zhu</surname></string-name>,
          <article-title>Psychiatric Scale Guided Risky Post Screening for Early Detection of Depression</article-title>,
          arXiv,
          <year>2022</year>.
          doi:10.48550/ARXIV.2205.09497.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>S.</given-names> <surname>Ji</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Zhang</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Ansari</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Fu</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Tiwari</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Cambria</surname></string-name>,
          <article-title>MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare</article-title>,
          in:
          <string-name><given-names>N.</given-names> <surname>Calzolari</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Béchet</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Blache</surname></string-name>,
          <string-name><given-names>K.</given-names> <surname>Choukri</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Cieri</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Declerck</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Goggi</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Isahara</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Maegaard</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Mariani</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Mazo</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Odijk</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Piperidis</surname></string-name>
          (Eds.),
          <source>Proceedings of the Thirteenth Language Resources and Evaluation Conference</source>,
          European Language Resources Association, Marseille, France,
          <year>2022</year>,
          pp. <fpage>7184</fpage>-<lpage>7190</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>M.</given-names> <surname>Abdin</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Aneja</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Behl</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Bubeck</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Eldan</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Gunasekar</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Harrison</surname></string-name>,
          <string-name><given-names>R. J.</given-names> <surname>Hewett</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Javaheripi</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Kaufmann</surname></string-name>,
          <string-name><given-names>J. R.</given-names> <surname>Lee</surname></string-name>,
          <string-name><given-names>Y. T.</given-names> <surname>Lee</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>,
          <string-name><given-names>W.</given-names> <surname>Liu</surname></string-name>,
          <string-name><given-names>C. C. T.</given-names> <surname>Mendes</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Nguyen</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Price</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>de Rosa</surname></string-name>,
          <string-name><given-names>O.</given-names> <surname>Saarikivi</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Salim</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Shah</surname></string-name>,
          <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Ward</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Wu</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Yu</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Zhang</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>,
          <article-title>Phi-4 Technical Report</article-title>,
          <year>2024</year>.
          doi:10.48550/arXiv.2412.08905. arXiv:2412.08905.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>A. T.</given-names> <surname>Beck</surname></string-name>,
          <string-name><given-names>R. A.</given-names> <surname>Steer</surname></string-name>,
          <string-name><given-names>G. K.</given-names> <surname>Brown</surname></string-name>,
          <source>BDI-II, Beck Depression Inventory: Manual</source>,
          Psychological Corporation,
          <year>1996</year>.
        </mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>L.</given-names> <surname>Wang</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Yang</surname></string-name>,
          <string-name><given-names>X.</given-names> <surname>Huang</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Yang</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Majumder</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Wei</surname></string-name>,
          <article-title>Multilingual E5 text embeddings: A technical report</article-title>,
          arXiv preprint arXiv:2402.05672
          (<year>2024</year>).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><given-names>T.</given-names> <surname>Basu</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Goldsworthy</surname></string-name>,
          <string-name><given-names>G. V.</given-names> <surname>Gkoutos</surname></string-name>,
          <article-title>A Sentence Classification Framework to Identify Geometric Errors in Radiation Therapy from Relevant Literature</article-title>,
          <source>Information</source>
          <volume>12</volume>
          (<year>2021</year>)
          <fpage>139</fpage>.
          doi:10.3390/info12040139.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name><given-names>D. E.</given-names> <surname>Losada</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>,
          <article-title>A Test Collection for Research on Depression and Language Use</article-title>,
          in:
          <string-name><given-names>N.</given-names> <surname>Fuhr</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Quaresma</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Gonçalves</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Larsen</surname></string-name>,
          <string-name><given-names>K.</given-names> <surname>Balog</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Macdonald</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Cappellato</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Ferro</surname></string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction</source>,
          Springer International Publishing, Cham,
          <year>2016</year>,
          pp. <fpage>28</fpage>-<lpage>39</lpage>.
          doi:10.1007/978-3-319-44564-9_3.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>