Natural Language Processing in the Detection and Treatment of Mental Health Issues

Alba María Mármol-Romero
Computer Science Department, SINAI, CEATIC, Universidad de Jaén, 23071, Spain

Abstract
Mental health issues are an increasing global public health concern. With the rise of social media, there is growing interest in the early detection of mental disorders by analyzing user posts. This research project aims to leverage Artificial Intelligence (AI) to enhance mental wellness. We operate initially under two primary hypotheses: first, that Natural Language Processing (NLP) can effectively identify signs of mental disorders or emotional distress in user messages over time; and second, that Large Language Models (LLMs) can provide high-quality information to mental health professionals. To test these hypotheses, we explore key research questions, including the feasibility of detecting emotional symptoms in text messages and the capability of chatbots to collect high-quality data for professional use. This study focuses primarily on the Spanish language.

Keywords
Mental Health, Mental Disorders Detection, Large Language Model, Early Risk Detection, Dialogue systems

1. Introduction
According to the World Health Organization (WHO), mental health is a state of mental well-being that enables individuals to cope with life’s stresses, realize their abilities, learn and work effectively, and contribute to their communities. The absence of mental health can lead to mental health problems such as eating disorders (ED), depression or anxiety disorders, the latter two being the most prevalent nowadays. In 2019, one in eight people in the world was diagnosed with one or more mental health issues, and it is estimated that in just the first year of the COVID-19 pandemic the number of people suffering from anxiety or depression increased by about 30% [1, 2]. In light of these data, both effective treatment and the prevention or early detection of signs of mental health problems are pressing social concerns [3].

In addition, with the increasing use of social media, people often express their emotional problems or thoughts on the Internet in search of comfort, support or simply to unwind [4, 5]. Given the large amount of information of this type available on social media in text format (natural language), NLP plays an important role in detecting sentiments and emotions [6] or hate speech [7]. Moreover, the evaluation and treatment of mental health issues also rely heavily on natural language, which makes NLP and LLMs potentially valuable tools for interpreting users’ mental and emotional states through their written communication [8].

In recent years, research in NLP and computational social science has increasingly focused on detecting mental health issues through online text data, such as social media content [9, 10]. For instance, studies have shown that NLP can analyze large datasets from social media platforms like Twitter or Facebook to detect subtle cues of mental health conditions by examining language patterns over time [11]. For this reason, our research focuses on applying NLP and AI to identify early the risk that a social media user is suffering from a disorder, and on developing systems and tools to support mental health professionals in their work. By harnessing the potential of these technologies, we strive not only to improve our understanding of the dynamics of mental health in digital spaces but also to provide the scientific and professional communities with the results of our mental well-being efforts.
Ultimately, this work aims to contribute to the development of tools in this field for Spanish speakers, since most of the research carried out so far has focused on the English language.

Doctoral Symposium on Natural Language Processing, 26 September 2024, Valladolid, Spain.
amarmol@ujaen.es (A. M. Mármol-Romero); ORCID 0000-0001-7952-4541 (A. M. Mármol-Romero)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

2. Related work
The detection of whether a person is suffering from a mental disorder and the implementation of treatment are two of the important tasks the scientific community has been tackling in recent years, and speed is one of the most important factors being taken into account. The early detection of signs of mental disorders is important since, if left undetected, mental disorders can develop into more serious consequences and constitute a major predictive factor of suicide [12]. Therefore, both the early detection of a person’s risk of suffering from a mental health problem and the rapid implementation of treatment are tasks of growing importance.

2.1. Early risk detection
The concept of early risk detection involves promptly identifying potential signs of mental health issues, hate speech, and other concerns on social networks [13]. This concept is central to the well-known eRisk shared task (https://erisk.irlab.org/), which began in 2017 and is hosted annually at the Conference and Labs of the Evaluation Forum (CLEF). Many studies on early risk prediction originate from eRisk, which has addressed early risk detection of gambling [14], self-harm [15], and disorders such as anorexia or depression [16]. These tasks consist of sequentially processing subjects’ posts on Reddit, simulating a real-time analysis of social media in order to evaluate the early risk detection capabilities of the participating teams’ systems.

So far, several datasets have been developed to identify mental health problems like stress, depression or anxiety [17, 18, 19], as well as the severity of depression [20, 21] or suicide risk [22]. However, to carry out the task of early detection it is necessary to have a user-level dataset that permits the application of measures such as the Early Risk Detection Error (ERDE) [23]. Twitter and Reddit are the most studied platforms for mental health research [24], especially for early detection. All eRisk datasets come from Reddit users, but there are other datasets annotated at the user level extracted from other social networks, such as Twitter-STMHD [25], a collection of Twitter profiles of users suffering from mental health disorders. These datasets often focus on English-language data, but there are exceptions. For instance, Villa-Pérez et al. [26] created two datasets, in English and Spanish, that comprise the timelines of Twitter users who explicitly reported in one or more of their posts having been diagnosed with a disorder. Additionally, the MentalRiskES dataset [27] is a new open-source corpus for the early detection of mental disorders in Spanish, focusing on eating disorders, depression, and anxiety. It consists of user messages posted in groups on the Telegram messaging platform. This corpus was used by the MentalRiskES shared task [28], organized at the Iberian Languages Evaluation Forum (IberLEF), which follows the same structure as the eRisk task and focuses on the early risk detection of mental disorders.
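As a concrete illustration of why user-level, chronologically ordered data is needed, the following is a minimal sketch of the per-subject ERDE cost following the formulation in [23]; the cost values and helper names are placeholders chosen for the example, not the shared-task implementation.

```python
import math

def latency_cost(k: int, o: int) -> float:
    """Latency penalty lc_o(k) = 1 - 1/(1 + exp(k - o)); it grows with the
    number of user posts k observed before a correct positive decision."""
    return 1.0 - 1.0 / (1.0 + math.exp(k - o))

def erde(decision: bool, truth: bool, k: int, o: int = 50,
         c_fp: float = 0.1296, c_fn: float = 1.0, c_tp: float = 1.0) -> float:
    """Early Risk Detection Error for one subject, following [23].
    c_fp is often set to the proportion of positive subjects in the collection;
    the value used here is only a placeholder."""
    if decision and not truth:
        return c_fp                       # false positive: fixed cost
    if not decision and truth:
        return c_fn                       # false negative: maximum cost
    if decision and truth:
        return latency_cost(k, o) * c_tp  # true positive: cost grows with delay
    return 0.0                            # true negative: no cost

# Example: the same correct alert is penalised far more when issued after
# seeing 80 posts than after 5 posts (hypothetical numbers).
print(erde(True, True, k=5, o=50), erde(True, True, k=80, o=50))
```

Averaging this per-subject cost over all subjects yields the system-level ERDE_o score used in eRisk-style evaluations, which is why post-by-post, user-level collections are required.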
However, one of the emerging branches in recent years stems from the need for explainability of a risk prediction [29, 30]. To justify that a person may be at risk of suffering from a mental health problem based on their message thread, the latest research focuses not so much on predicting whether or not they suffer from a mental disorder as on detecting whether or not they exhibit certain symptoms. The eRisk 2023 edition [31] introduced a novel task that consisted of ranking sentences according to their relevance to standardised symptoms of depression. It used the BDI-Sen [32] dataset, a corpus labelled at the level of Reddit post sentences that covers all the symptoms present in the Beck Depression Inventory-II (BDI-II). Similarly, PsySym [33] is another symptom identification corpus, annotated at the sentence level of Reddit posts for multiple psychiatric disorders, also in English.

2.2. Dialogue systems for mental health
In the medical field, conversational agents (CAs) are gaining popularity, particularly for addressing health-related inquiries, a functionality that can also be extended to mental health concerns [34, 35]. So far, dialogue systems have mainly assisted users in performing self-reporting methods; however, it is interesting to see how current studies take advantage of the capabilities of conversational agents and smartphones to carry out these assessments more efficiently [36].

Replika (https://replika.ai/) [37] is an AI companion chatbot designed to engage users in meaningful conversations. While not exclusively focused on mental health, many users find comfort in talking to Replika about their feelings and emotions. Replika adapts to the user’s conversational style and provides emotional support, making it a versatile tool for mental well-being. Nowadays, popular chatbots such as Woebot (https://woebothealth.com/) [38] provide support for emotional problems using NLP and AI to understand users’ emotions. It was originally developed as a study to mitigate symptoms of anxiety and depression in adolescents and, as of now, Woebot is only available to new users in the United States who are participating in the study. Youper (https://www.youper.ai/) [39] is another popular AI-powered emotional health assistant that uses cognitive behavioural therapy (CBT) and other therapeutic techniques to guide users through conversations aimed at improving their mental well-being. Youper helps users track their moods, understand their emotions, and develop healthier mental habits. For Spanish speakers, Wysa (https://www.wysa.com/) [40] is a significant chatbot in this space. Like Woebot, Wysa uses NLP and AI to understand users’ emotions but extends its support by integrating CBT, mindfulness, and Dialectical Behaviour Therapy (DBT). Available in both English and Spanish, Wysa serves users in 65 countries worldwide.

Studies have shown that chatbots can significantly benefit both young people and adults by reducing symptoms of anxiety and depression and improving overall mood [38, 41, 40]. One key factor contributing to these positive outcomes is the concept of self-disclosure [42, 43]. Self-disclosure involves sharing personal and intimate information, which plays a crucial role in building intimacy and trust between individuals [44]. In the context of chatbots, when these digital companions engage in self-disclosure, it not only enhances the perceived intimacy and enjoyment users experience but also fosters a deeper emotional connection. This increased trust can make users feel more understood and supported, thereby boosting their emotional well-being.
Furthermore, users often feel more comfortable sharing their concerns and feelings with a chatbot that demonstrates openness, which can lead to a more meaningful and supportive interaction [45, 46]. This reciprocal sharing creates a sense of mutual understanding and empathy, contributing positively to the user’s mental health and overall sense of connection.

Despite their benefits, the deployment of chatbots in mental health care presents challenges and ethical considerations. Although people who frequently use these tools trust them and their security more than people who have never used any of them, these chatbots can also be disruptive and introduce risks for users around sensitive questions or the disclosure of information [47]. Privacy concerns, the accuracy of the chatbots’ responses, and the need for human oversight are significant issues that researchers and developers must address to ensure chatbots provide reliable and safe support. However, chatbots do not necessarily have to be emotionally involved with the individual; sometimes they are simply useful tools for gathering information beyond applying a self-reported test [48], since users find language more precise for communicating their mental health issues and prefer it to rating scales [49].

3. Hypotheses and objectives
Given the large number of existing applications of NLP and the use of more advanced techniques such as LLMs in the mental health field, our research elaborates on the following premises as scientific hypotheses:
• H1: NLP techniques allow identifying and tracking signs of mental disorders or emotional problems in user-generated text messages over time.
• H2: LLMs can provide high-quality, contextually relevant information and support to mental health professionals, enhancing their ability to diagnose and treat patients effectively.
To prove these hypotheses, we asked ourselves the following research questions:
• Q1: Can NLP models accurately identify symptoms of emotional problems in text messages?
• Q2: Are there identifiable linguistic features or patterns that are most indicative of these symptoms in the Spanish language?
• Q3: Can we utilize LLMs to induce user self-disclosure in mental health?
• Q4: Can we assess the feasibility of collecting high-quality, clinically relevant data through interactions with chatbots?
• Q5: Can we build chatbots to elicit meaningful mental-health-related information from users while maintaining user trust and engagement?

4. Methodology
As detailed below, in previous research work during this pre-doctoral period I developed resources, such as a corpus and a basic dialogue system, that will serve as foundational elements and support for future work. Additionally, several experiments have been conducted to validate and extend these initial developments.

4.1. Dialogue system
As preliminary work associated with a research project called BigHug (https://bighug.ujaen.es/), focused on the early detection of disorders and misbehaviours in online social networks, a dialogue system was developed. For this project, the author of this paper developed a novel chatbot [50] to talk about several mental disorders with young Spanish speakers on the Telegram platform. The most novel aspect of this chatbot, apart from its ability to converse in Spanish, is that it allows for both closed and open dialogue, and it does not present itself as a therapist but as another teenager who wants to talk about his or her problems. For the open dialogue, we integrated the Generative Pre-trained Transformer (GPT-3), which is trained mostly on English texts, so we also used DeepL (https://www.deepl.com/en/docs-api/) to translate. For the controlled dialogue, we used questions and sentences established by psychologists and specialists in mental disorders in teenagers.
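A minimal sketch of such a hybrid controlled/open dialogue turn is shown below. It is illustrative only: the script contents, the `translate` and `generate_reply` helpers and the routing rule are assumptions made for the example and do not reflect the actual implementation described in [50].

```python
# Illustrative sketch of one turn of a hybrid dialogue: scripted (controlled)
# questions defined with specialists are asked in order; everything else falls
# back to a generative model, translating Spanish <-> English around it because
# the generative model was trained mostly on English text. Helpers are stubs.
from typing import List

SCRIPTED_QUESTIONS: List[str] = [          # hypothetical controlled-dialogue script
    "¿Cómo te has sentido esta semana?",
    "¿Has podido hablar de esto con alguien de confianza?",
]

def translate(text: str, target_lang: str) -> str:
    """Stub for a translation service; returns the text unchanged here."""
    return text

def generate_reply(prompt_en: str) -> str:
    """Stub for a generative language model call; a fixed reply stands in for it."""
    return "I understand, thank you for sharing that with me."

def next_turn(user_message_es: str, turn_index: int) -> str:
    """Return the bot's next message for one dialogue turn."""
    if turn_index < len(SCRIPTED_QUESTIONS):
        # Controlled dialogue: follow the script agreed with psychologists.
        return SCRIPTED_QUESTIONS[turn_index]
    # Open dialogue: translate to English, query the generative model, translate back.
    prompt_en = translate(user_message_es, target_lang="EN")
    reply_en = generate_reply(prompt_en)
    return translate(reply_en, target_lang="ES")

print(next_turn("Me cuesta dormir últimamente", turn_index=0))  # scripted question
print(next_turn("No sé qué me pasa", turn_index=5))             # generative fallback
```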
The creation of the dialogue system involved a major collaboration, but the full development of the system was carried out by the author of this paper, and it now forms a key part of the thesis. It will be a central component of ongoing and future research work related to H2 above. The basic corpus obtained from this experimentation has been used in initial experiments to test and refine the functionalities of the system and to detect needs and strengths. Moreover, there are now generative language models with greater capacity, which is why we propose an updated development focused on the needs of a therapist, in particular on the ability to contextualise and synthesise useful information.

4.2. Developed dataset
A new extensive dataset entitled MentalRiskES [27] was developed, containing threads of messages in Spanish. Its three data collections, for evaluating early risk detection of three mental disorders (ED, depression, and anxiety), contain more than 45,000 messages sent by over 1,300 subjects from various Telegram groups. The data was annotated via crowdsourcing according to the definitions of these disorders by reference organizations such as the WHO and the symptoms of each disorder: 10 annotators labelled each subject (their last 50 or 100 messages sent to the platform) following the annotation guidelines. This dataset was used in the shared task of the same name, MentalRiskES [28], hosted at IberLEF (editions 39 and 40), and is available upon request via GitHub (https://github.com/sinai-uja/corpusMentalRiskEs).

The creation of this collection of Spanish-language message threads aimed at early risk detection for mental health disorders is directly relevant to H1 of the thesis. The author of the thesis has been directly involved in all phases of the development of this corpus. As future work, we plan to analyse the created dataset in more depth and to develop a new dataset that will allow symptom-level detection for certain disorders in Spanish messages.
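To illustrate how the 10 annotations per subject described above can be turned into user-level targets, the sketch below derives both a binary label (majority vote) and a soft label (proportion of positive votes) for each subject. The data layout and the idea of exposing both label types are assumptions made for the example, not a description of the released corpus format.

```python
from typing import Dict, List, Tuple

# Hypothetical annotations: subject id -> ten binary votes (1 = at risk).
annotations: Dict[str, List[int]] = {
    "subject_001": [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "subject_002": [0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
}

def aggregate(votes: List[int]) -> Tuple[int, float]:
    """Return (binary label by majority vote, proportion of positive votes)."""
    soft = sum(votes) / len(votes)
    return int(soft >= 0.5), soft

for subject, votes in annotations.items():
    label, prob = aggregate(votes)
    print(f"{subject}: binary={label}, soft={prob:.2f}")
```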
4.3. Participation in Shared Tasks
We have engaged in the eRisk shared task organized during CLEF conferences to test our hypotheses and gain access to annotated data. We plan to continue participating in future editions of the task.
• eRisk 2022 [51]. Two of the proposed tasks were addressed: early detection of signs of pathological gambling, and measuring the severity of the signs of eating disorders. The approach presented for the first task is based on the use of sentence embeddings from Transformers with features related to volumetry, lexical diversity, complexity metrics, and emotion-related scores, while the approach for the second task is based on text similarity estimation using contextualized word embeddings from Transformers [52].
• eRisk 2023 [31]. One of the proposed tasks was addressed: early detection of signs of pathological gambling. The approach presented is based on pre-trained Transformer models with comprehensive data preprocessing and data balancing techniques. Moreover, we integrated a Long Short-Term Memory (LSTM) architecture with Transformer automodels [53].
• eRisk 2024 [54]. Two of the proposed tasks were addressed: the search for symptoms of depression and the early detection of signs of anorexia. The approach presented for the first task is based on a two-step detection approach using a transformer-based model, while the approach for the second is based on calculating perplexity using two transformer-based models trained with causal language modelling [55].
On the other hand, I was part of the organising committee of MentalRiskES, a shared task organized at IberLEF (editions 39 and 40) as part of the International Conference of the Spanish Society for Natural Language Processing (SEPLN). MentalRiskES aims to promote the early detection of mental disorder risks in Spanish.
• MentalRiskES 2023 [28]. We outlined three detection tasks: Task 1 on eating disorders, Task 2 on depression, and Task 3 on a disorder undisclosed during the competition (anxiety), in order to observe the transfer of knowledge among the different disorders proposed. To establish a baseline benchmark, we performed experiments using three different Transformer-based models. In this edition, 37 teams from 8 different countries registered, 17 sent their submissions and 16 wrote their working notes.
• MentalRiskES 2024 [56]. We proposed three detection tasks: Task 1 to detect risk of depression or anxiety, Task 2 also for depression and anxiety but additionally determining contextual risk factors, and Task 3 to identify whether a subject is at risk of suicidal ideation. To establish a baseline benchmark, we performed experiments using three different Transformer-based models. In this edition, 28 teams from 10 different countries registered, 12 sent their submissions and 10 wrote their working notes.
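To make the perplexity-based idea mentioned above for eRisk 2024 concrete, the following is a minimal sketch of how a post’s perplexity under a causal language model can be computed and compared across two models (one adapted to at-risk language, one control); the model choice, the example post and the decision rule are placeholders, not the configuration used in [55].

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(text: str, model_name: str) -> float:
    """Perplexity of `text` under a causal LM: exp of the mean token-level loss."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return float(torch.exp(outputs.loss))

# In this sketch, one causal LM would be adapted to at-risk language and the
# other kept as a control; a post that looks comparatively less surprising to
# the risk-adapted model is treated as a possible early signal. "gpt2" is used
# twice below only so the snippet runs; the real models and decision rule in
# [55] may differ.
post = "Últimamente no tengo ganas de comer nada."
risk_ppl = perplexity(post, "gpt2")      # stand-in for a risk-adapted model
control_ppl = perplexity(post, "gpt2")   # stand-in for a control model
print("possible risk signal:", risk_ppl < control_ppl)
```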
5. Research Elements Proposed for Discussion
My research is still at an early stage, so there are many questions to address and elements to be proposed and discussed. Some of them are the following:
• Early detection of mental disorders and emotional problems: What are the challenges and limitations of detecting signs of mental health problems in text data? Can early detection methods be generalised to different mental health conditions or should they be disease-specific? Is it possible to identify symptoms of emotional problems in text messages? How accurately can NLP models detect early signs of specific mental health disorders in text communications?
• Effectiveness of dialogue systems in mental health support: What are the key features and functionalities that should be included in the dialogue system to ensure it is both supportive and safe for users? Can a chatbot be designed to collect high-quality data that is valuable and reliable for mental health professionals? How can the impact of the dialogue system on users’ mental health and well-being be measured accurately? In what ways can an LLM be designed to encourage user self-disclosure in a manner that promotes well-being? How do users perceive the use of chatbots and automated systems in their mental health care? What factors influence their acceptance and trust, and how can these systems be designed to align with the standards and practices of certified mental health diagnostics and treatments?
• Ethical and legal considerations: What are the ethical implications of using automated systems for detecting and responding to mental health issues? Is it possible to ensure the system does not inadvertently cause harm or distress to users?

Acknowledgments
My sincere thanks to my thesis tutors, Arturo Montejo-Ráez, Miguel Ángel García-Cumbreras and Manuel García-Vega, for guiding me throughout this process, and to the doctoral programme of the University of Jaén and the Centre for Advanced Studies in Information and Communication Technologies (CEATIC, for its acronym in Spanish) for their support in this research experience. This work has been supported by project MODERATES (TED2021-130145B-I00) funded by Plan Nacional I+D+i from the Spanish Government.

References
[1] World Health Organization, Mental disorders, https://www.who.int/es/news-room/fact-sheets/detail/mental-disorders, 2022. Accessed: 06.06.2024. [2] A. Kumar, K. R. Nayar, Covid 19 and its mental health consequences, Journal of Mental Health 30 (2021) 1–2. doi:10.1080/09638237.2020.1757052, PMID: 32339041. [3] J. Rehm, K. D. Shield, Global burden of disease and the impact of mental and addictive disorders, Current psychiatry reports 21 (2019) 1–7. [4] T. Zhang, K. Yang, S. Ji, S. Ananiadou, Emotion fusion for mental illness detection from social media: A survey, Information Fusion 92 (2023) 231–246. [5] T. Zhang, A. M. Schoene, S. Ji, S. Ananiadou, Natural language processing applied to mental illness detection: a narrative review, NPJ digital medicine 5 (2022) 1–13. [6] S. Zad, M. Heidari, H. James Jr, O. Uzuner, Emotion detection of textual data: An interdisciplinary survey, in: 2021 IEEE World AI IoT Congress (AIIoT), IEEE, 2021, pp. 0255–0261. [7] F. M. Plaza-del Arco, M. D. Molina-González, L. A. Urena-López, M. T. Martín-Valdivia, Comparing pre-trained language models for spanish hate speech detection, Expert Systems with Applications 166 (2021) 114120. [8] X. Xu, B. Yao, Y. Dong, S. Gabriel, H. Yu, J. Hendler, M. Ghassemi, A. K. Dey, D. Wang, Mental-llm: Leveraging large language models for mental health prediction via online text data, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8 (2024) 1–32. [9] S. C. Guntuku, D. B. Yaden, M. L. Kern, L. H. Ungar, J. C. Eichstaedt, Detecting depression and mental illness on social media: an integrative review, Current Opinion in Behavioral Sciences 18 (2017) 43–49. [10] M. Malgaroli, T. D. Hull, J. M. Zech, T. Althoff, Natural language processing for mental health interventions: a systematic review and research framework, Translational Psychiatry 13 (2023) 309. [11] S. Henry, M. Yetisgen, O. Uzuner, Natural Language Processing in Mental Health Research and Practice, Springer International Publishing, Cham, 2021, pp. 317–353. URL: https://doi.org/10.1007/978-3-030-70558-9_13. doi:10.1007/978-3-030-70558-9_13. [12] World Health Organization, Depressive disorder (depression), https://www.who.int/news-room/fact-sheets/detail/depression, 2023. Accessed: 07.06.2024. [13] D. E. Losada, F. Crestani, J. Parapar, erisk 2017: Clef lab on early risk prediction on the internet: experimental foundations, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11–14, 2017, Proceedings 8, Springer, 2017, pp. 346–360. [14] J. Parapar, P. Martín-Rodilla, D. E. Losada, F.
Crestani, Overview of erisk at clef 2021: Early risk prediction on the internet (extended overview)., CLEF (Working Notes) (2021) 864–887. [15] D. E. Losada, F. Crestani, J. Parapar, Overview of erisk 2019 early risk prediction on the internet, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 10th International Conference of the CLEF Association, CLEF 2019, Lugano, Switzerland, September 9–12, 2019, Proceedings 10, Springer, 2019, pp. 340–357. [16] D. E. Losada, F. Crestani, J. Parapar, Overview of erisk: early risk prediction on the internet, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 9th International Confer- ence of the CLEF Association, CLEF 2018, Avignon, France, September 10-14, 2018, Proceedings 9, Springer, 2018, pp. 343–361. [17] D. Owen, J. C. Collados, L. Espinosa-Anke, Towards preemptive detection of depression and anxiety in twitter, arXiv preprint arXiv:2011.05249 (2020). [18] A. Haque, V. Reddi, T. Giallanza, Deep learning for suicide and depression identification with unsupervised label correction, in: Artificial Neural Networks and Machine Learning–ICANN 2021: 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 14–17, 2021, Proceedings, Part V 30, Springer, 2021, pp. 436–447. [19] S. Ji, X. Li, Z. Huang, E. Cambria, Suicidal ideation and mental disorder detection with attentive relation networks, Neural Computing and Applications 34 (2021) 10309–10319. doi:10.1007/ s00521-021-06208-y. [20] I. Pirina, Ç. Çöltekin, Identifying depression on Reddit: The effect of training data, in: G. Gonzalez-Hernandez, D. Weissenbacher, A. Sarker, M. Paul (Eds.), Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task, Association for Computational Linguistics, Brussels, Belgium, 2018, pp. 9–12. doi:10.18653/v1/W18-5903. [21] U. Naseem, A. G. Dunn, J. Kim, M. Khushi, Early identification of depression severity levels on reddit using ordinal classification, in: Proceedings of the ACM Web Conference 2022, Association for Computing Machinery, New York, NY, USA, 2022, p. 2563–2572. doi:10.1145/3485447. 3512128. [22] M. Gaur, A. Alambo, J. P. Sain, U. Kursuncu, K. Thirunarayan, R. Kavuluru, A. Sheth, R. Welton, J. Pathak, Knowledge-aware assessment of severity of suicide risk for early intervention, in: The World Wide Web Conference, Association for Computing Machinery, New York, NY, USA, 2019, p. 514–525. doi:10.1145/3308558.3313698. [23] D. E. Losada, F. Crestani, A test collection for research on depression and language use, in: N. Fuhr, P. Quaresma, T. Gonçalves, B. Larsen, K. Balog, C. Macdonald, L. Cappellato, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction, Springer International Publishing, Cham, 2016, pp. 28–39. [24] K. Harrigian, C. Aguirre, M. Dredze, On the state of social media data for mental health research, arXiv preprint arXiv:2011.05233 (2020). [25] A. K. Singh, U. Arora, S. Shrivastava, A. Singh, R. R. Shah, P. Kumaraguru, et al., Twitter-stmhd: An extensive user-level database of multiple mental health disorders, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 16, 2022, pp. 1182–1191. [26] M. E. Villa-Pérez, L. A. Trejo, M. B. Moin, E. Stroulia, Extracting mental health indicators from english and spanish social media: A machine learning approach, IEEE Access 11 (2023) 128135– 128152. [27] A. M. Mármol Romero, A. Moreno Muñoz, F. M. 
Plaza-del Arco, M. D. Molina González, M. T. Martín Valdivia, L. A. Ureña-López, A. Montejo Ráez, MentalRiskES: A new corpus for early detection of mental disorders in Spanish, in: N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELRA and ICCL, Torino, Italia, 2024, pp. 11204–11214. URL: https://aclanthology.org/2024.lrec-main.978. [28] A. M. Mármol-Romero, A. Moreno-Muñoz, F. M. Plaza-del Arco, M. D. Molina-González, M. T. Martín-Valdivia, L. A. Ureña-López, A. Montejo-Raéz, Overview of MentalriskES at IberLEF 2023: Early Detection of Mental Disorders Risk in Spanish, Procesamiento del Lenguaje Natural 71 (2023) 329–350. [29] L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, L. Kagal, Explaining explanations: An overview of interpretability of machine learning, in: 2018 IEEE 5th International Conference on data science and advanced analytics (DSAA), IEEE, 2018, pp. 80–89. [30] A. S. Uban, B. Chulvi, P. Rosso, On the explainability of automatic predictions of mental disorders from social media data, in: International Conference on Applications of Natural Language to Information Systems, Springer, 2021, pp. 301–314. [31] J. Parapar, P. Martín-Rodilla, D. E. Losada, F. Crestani, Overview of erisk 2023: Early Risk Prediction on the Internet, in: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, 2023, pp. 294–315. [32] A. Pérez, J. Parapar, Á. Barreiro, S. Lopez-Larrosa, Bdi-sen: A sentence dataset for clinical symptoms of depression, in: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023, pp. 2996–3006. [33] Z. Zhang, S. Chen, M. Wu, K. Zhu, Symptom identification for interpretable detection of multiple mental disorders on social media, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 9970–9985. doi:10.18653/ v1/2022.emnlp-main.677. [34] S. Siddique, J. C. L. Chow, Machine learning in healthcare communication, Encyclopedia 1 (2021) 220–239. doi:10.3390/encyclopedia1010021. [35] J. C. Chow, V. Wong, L. Sanders, K. Li, Developing an ai-assisted educational chatbot for ra- diotherapy using the ibm watson assistant platform, in: Healthcare, volume 11, MDPI, 2023, p. 2417. [36] A. I. Jabir, L. Martinengo, X. Lin, J. Torous, M. Subramaniam, L. Tudor Car, Evaluating con- versational agents for mental health: scoping review of outcomes and outcome measurement instruments, Journal of Medical Internet Research 25 (2023) e44548. [37] M. Skjuve, A. Følstad, K. I. Fostervold, P. B. Brandtzaeg, My chatbot companion-a study of human- chatbot relationships, International Journal of Human-Computer Studies 149 (2021) 102601. [38] K. K. Fitzpatrick, A. Darcy, M. Vierhile, Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (woebot): A randomized controlled trial, JMIR Ment Health 4 (2017) e19. doi:10.2196/mental.7785. [39] A. Mehta, A. N. Niles, J. H. Vargas, T. Marafon, D. D. Couto, J. J. 
Gross, Acceptability and effectiveness of artificial intelligence therapy for anxiety and depression (youper): Longitudinal observational study, Journal of medical Internet research 23 (2021) e26771. [40] B. Inkster, S. Sarda, V. Subramanian, An empathy-driven, conversational artificial intelligence agent (wysa) for digital mental well-being: Real-world data evaluation mixed-methods study, JMIR Mhealth Uhealth 6 (2018) e12106. doi:10.2196/12106. [41] O. Romanovskyi, N. Pidbutska, A. Knysh, Elomia chatbot: The effectiveness of artificial intelligence in the fight for mental health., in: COLINS, 2021, pp. 1215–1224. [42] A.-K. Reuel, S. Peralta, J. Sedoc, G. Sherman, L. Ungar, Measuring the language of self-disclosure across corpora, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.), Findings of the Association for Computational Linguistics: ACL 2022, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 1035–1047. URL: https://aclanthology.org/2022.findings-acl.83. doi:10.18653/v1/2022. findings-acl.83. [43] T. Blose, P. Umar, A. Squicciarini, S. Rajtmajer, Privacy in crisis: A study of self-disclosure during the coronavirus pandemic, 2020. URL: https://arxiv.org/abs/2004.09717. arXiv:2004.09717. [44] D. Catona, K. Greene, Self-Disclosure, 2015. doi:10.1002/9781118540190.wbeic162. [45] Y.-C. Lee, N. Yamashita, Y. Huang, W. Fu, "i hear you, i feel you": Encouraging deep self-disclosure through a chatbot, in: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, Association for Computing Machinery, New York, NY, USA, 2020, p. 1–12. doi:10.1145/3313831.3376175. [46] J. Meng, N. Dai, Emotional support from ai chatbots: Should a supportive partner self-disclose or not?, Journal of Computer-Mediated Communication 26 (2021) 207–222. doi:10.1093/jcmc/ zmab005. [47] P. Chametka, S. Maqsood, S. Chiasson, Security and privacy perceptions of mental health chatbots, in: 2023 20th Annual International Conference on Privacy, Security and Trust (PST), IEEE, 2023, pp. 1–7. [48] A. Schick, J. Feine, S. Morana, A. Maedche, U. Reininghaus, Validity of chatbot use for mental health assessment: experimental study, JMIR mHealth and uHealth 10 (2022) e28082. [49] V. Varadarajan, S. Sikström, O. Kjell, H. Schwartz, ALBA: Adaptive language-based assessments for mental health, in: K. Duh, H. Gomez, S. Bethard (Eds.), Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Association for Computational Linguistics, Mexico City, Mexico, 2024, pp. 2466–2478. URL: https://aclanthology.org/2024.naacl-long.136. [50] A. M. Mármol-Romero, M. García-Vega, M. Á. García-Cumbreras, A. Montejo-Ráez, An empathic gpt-based chatbot to talk about mental disorders with spanish teenagers, International Journal of Human–Computer Interaction (2024) 1–17. [51] P. Martın-Rodilla, D. E. Losada, F. Crestani, Overview of eRisk 2022: Early Risk Prediction on the Internet, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 13th International Conference of the CLEF Association, CLEF 2022, Bologna, Italy, September 5–8, 2022, Proceedings, volume 13390, Springer Nature, 2022, p. 233. [52] A. M. Mármol-Romero, S. M. J. Zafra, F. M. P. del Arco, M. D. Molina-González, M. T. M. Valdivia, A. Montejo-Ráez, Sinai at erisk@ clef 2022: Approaching early detection of gambling and eating disorders with natural language processing., in: Working Notes of CLEF), 2022, pp. 
961–971. [53] A. M. Mármol-Romero, F. del Arco, A. Montejo-Ráez, Sinai at erisk@ clef 2023: Approaching early detection of gambling with natural language processing, Working Notes of CLEF (2023) 18–21. [54] J. Parapar, P. Martín-Rodilla, D. E. Losada, F. Crestani, erisk 2024: Depression, anorexia, and eating disorder challenges, in: Advances in Information Retrieval: 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, UK, March 24–28, 2024, Proceedings, Part V, Springer-Verlag, Berlin, Heidelberg, 2024, p. 474–481. doi:10.1007/978-3-031-56069-9_65. [55] A. M. Mármol-Romero, A. Moreno Muñoz, P. Álvarez-Ojeda, K. M. Valencia-Segura, M.-C. Eugenio, M. García-vega, A. Montejo-Ráez, Sinai at erisk@ clef 2024: Approaching the search for symptoms of depression and early detection of anorexia signs using natural language processing., in: Working Notes of CLEF, 2024. To appear. [56] A. M. Mármol-Romero, A. Moreno-Muñoz, F. M. Plaza-del Arco, M. D. Molina-González, M. T. Martín-Valdivia, L. A. Ureña-López, A. Montejo-Raéz, Overview of MentalriskES at IberLEF 2024: Early Detection of Mental Disorders Risk in Spanish, Procesamiento del Lenguaje Natural 73 (2024). To appear.