<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Transformer Models for Detecting Suicidal Ideation in Social Media Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Viankail Cedillo-Castelán</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Autonomous University of Mexico, C.U. Coyoacán</institution>
          ,
          <addr-line>04510, Mexico City</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This study is a contribution to Task 3 of MentalRiskES 2024 and explores the detection of suicidal ideation using a range of machine learning models, from traditional algorithms to Transformer models. We assess several approaches, including Logistic Regression, Naïve Bayes, Random Forest, and variants of the RoBERTuito model, focusing particularly on adapting the RoBERTuito base model for text analysis and sentiment analysis in Spanish. Results indicate that the RoBERTuito base model achieves the highest accuracy among the models evaluated, demonstrating its potential for critical mental health applications in Spanish-speaking contexts.</p>
      </abstract>
      <kwd-group>
        <kwd>Suicidal ideation detection</kwd>
        <kwd>Natural language processing (NLP)</kwd>
        <kwd>Transformer models</kwd>
        <kwd>RoBERTuito model</kwd>
        <kwd>Predictive analytics in healthcare</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Mental health and wellness are integral to overall well-being. Mental health research is vital as it
influences our emotional, psychological, and social functioning. Mental illnesses, often perceived
as personal and isolating struggles, are also significant public health concerns. Society must treat
mental disorders with the same urgency and care as chronic medical conditions, recognizing their
high treatability and the potential for recovery in many individuals [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. A better understanding of
mental health disorders could lead to improved treatments, interventions, and public health strategies,
enhancing mental wellness. Maintaining good mental health is crucial for managing stress, relating
to others, and making decisions. With rising global concerns about mental health, there is an urgent
need for advanced methods to detect severe mental health conditions early and intervene appropriately.
This is particularly crucial for conditions like suicidal ideation—a severe and often hidden symptom
associated with many mental health disorders.
      </p>
      <p>
        Suicidal ideation involves thoughts of self-harm or ending one’s life and is a critical mental health
challenge with potentially fatal outcomes if unaddressed. This issue affects a diverse range of individuals
and is often linked to mental health disorders, though it can also arise from situational stressors. Early
recognition of suicidal ideation and understanding its root causes are essential for improving patient
outcomes and reducing suicide risks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The World Health Organization (WHO) reports that over
700,000 individuals die by suicide each year, with about one completed suicide for every twenty attempts
and many more experiencing suicidal thoughts [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        The expression and detection of suicidal intent can be facilitated through the analysis of social media
posts using natural language processing (NLP) and machine learning (ML) algorithms [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Progress
in this field, as shown by researchers like Reece &amp; Danforth [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], has been significant. Yet despite
advancements in NLP and ML that enable the analysis of textual content for signs of mental health
concerns, much of the research, including studies by Coppersmith et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], has focused on specific
demographic data. Mental health is deeply influenced by biological, economic, social, and ethnic factors.
      </p>
      <p>While some studies have begun to analyze mental health based on gender and social status, attention to
different ethnicities remains limited. Notably, research on suicidal ideation among Hispanic populations
is scarce.</p>
      <p>This study aims to address this gap by evaluating various ML models in NLP, including traditional
classifiers like Logistic Regression, Naïve Bayes, and Random Forest, as well as advanced
Transformer-based models such as RoBERTuito, which is pretrained on Spanish-language social media text and is
therefore well placed to capture the nuances of mental health discourse in Spanish. The effectiveness of
RoBERTuito in detecting suicidal thoughts in social media texts underscores its potential in public health
strategies for suicide prevention, highlighting the need for culturally relevant mental health interventions
within Spanish-speaking communities.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Data Collection and Usage</title>
        <p>
          For this analysis, the task organizers provided only the test dataset for Task 3, following the
same procedure as for the previous corpus [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Consequently, to train and validate the models,
it was necessary to compile a training dataset from external sources. The primary criteria for data
selection were relevance to the task, quality and reliability of the data, and ease of access
and use in compliance with legal and ethical standards.
        </p>
        <p>
          Given these requirements, we used the training dataset for Task 1 (disorder detection) of MentalRiskES
[
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] at IberLEF 2024 [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], which consists of 727 social media messages in Spanish, together
with complementary datasets from the published literature. Articles were selected for the relevance of
their content to the task domain, namely the identification of suicidal ideation in social
media posts. This choice was guided by the need to train our models on data that closely mirrors the
characteristics of the test set provided by the task organizers.
        </p>
        <p>
          Publicly available data were taken from the published tables of two studies: one on the recognition of
suicidal intent among depressed populations, comprising 20 social media messages in English [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ],
and one on screening for suicide risk using social media content, comprising 19 social media messages in
Spanish [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          To adapt the datasets drawn from this literature, social media content was translated
from the original English into Spanish using neural machine translation tools, specifically Google
Translate and DeepL. These tools were selected for their effectiveness and accessibility, as
documented in previous studies [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ]. Using these freely available online services allowed for a
consistent and efficient translation process intended to preserve the context of the original text,
in line with methodologies employed in prior research [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ].
        </p>
        <p>Model training and evaluation involved splitting the collected data with
train_test_split: the models were trained on 80% of the collected dataset and evaluated on the
remaining 20%, referred to here as the testing set. This held-out testing set is distinct from the official
test set provided by the task organizers.</p>
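        <p>The 80/20 hold-out split can be illustrated with a minimal sketch. It uses only the Python standard library as a stand-in for scikit-learn's train_test_split, so it is not the code used in this study; the function name <monospace>holdout_split</monospace>, the seed, and the toy messages are hypothetical.</p>
        <preformat>
```python
import random


def holdout_split(texts, labels, test_fraction=0.2, seed=42):
    """Shuffle paired texts and labels, then hold out a fraction for testing."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    indices = list(range(len(texts)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    train = [(texts[i], labels[i]) for i in train_idx]
    test = [(texts[i], labels[i]) for i in test_idx]
    return train, test


# Hypothetical toy data: ten messages with alternating binary labels.
texts = ["mensaje %d" % i for i in range(10)]
labels = [i % 2 for i in range(10)]
train, test = holdout_split(texts, labels)
print(len(train), len(test))
```
        </preformat>
        <p>Fixing the seed matters when several models are compared, because every model is then trained and evaluated on the same partition.</p>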
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Computational Environment</title>
        <p>The analysis was conducted using Python 3.9.6. The primary libraries used, chiefly for the
traditional classifiers (logistic regression, random forest, Naïve Bayes, and SVC), were:</p>
        <list list-type="bullet">
          <list-item><p>General libraries: pandas, NumPy, scikit-learn (version 0.24.1), NLTK, Gensim, SciPy (version 1.6.0), os, re, json.</p></list-item>
          <list-item><p>Specific tools for text feature extraction: TfidfVectorizer, CountVectorizer, Word2Vec (via gensim).</p></list-item>
          <list-item><p>Miscellaneous: TensorFlow, Keras, transformers, datasets, accelerate.</p></list-item>
        </list>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Model Training and Evaluation</title>
        <p>For the logistic regression model, text data were preprocessed using TfidfVectorizer and CountVectorizer.
For the implementation of the Naive Bayes classification model, the Multinomial Naive Bayes algorithm
provided by the scikit-learn library in Python was utilized. For the random forest model, text data were
converted into numerical vectors using the word2vec-google-news-300 model from gensim, focusing
on semantic content. Stopwords were removed from the text data in the random forest model using the
nltk library, which provides a comprehensive list of stopwords for various languages. For the LSTM
classifier, the TensorFlow and Keras frameworks were employed to construct and train the model.</p>
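        <p>To make the tf-idf weighting concrete, the sketch below computes smoothed, L2-normalised tf-idf rows with the standard library only. It is a simplified illustration under stated assumptions, not the TfidfVectorizer implementation: tokenisation is a plain lowercase whitespace split, the function name <monospace>tfidf_rows</monospace> is hypothetical, and the three Spanish messages are invented toy data.</p>
        <preformat>
```python
import math
from collections import Counter


def tfidf_rows(documents):
    """Return the vocabulary and one L2-normalised tf-idf row per document."""
    tokenised = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(tokenised)
    # Document frequency: in how many documents each token occurs.
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocabulary}
    rows = []
    for toks in tokenised:
        counts = Counter(toks)
        row = [counts[tok] * idf[tok] for tok in vocabulary]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocabulary, rows


# Hypothetical toy corpus.
docs = ["no puedo dormir", "hoy fue un buen día", "no fue un buen día"]
vocabulary, rows = tfidf_rows(docs)
print(vocabulary)
print([round(v, 3) for v in rows[0]])
```
        </preformat>
        <p>Tokens that occur in fewer documents receive a larger weight, which is why a rare word contributes more to a row than a word shared by most messages.</p>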
        <p>To address the complexities of language and sentiment inherent in this task, we then moved to
Transformer-based models.</p>
        <p>For the classification tasks involving RoBERTa (Robustly Optimized BERT Pretraining Approach),
we used the transformers library. RoBERTa, an optimized version of BERT, refines the pretraining
procedure to achieve more robust performance across a wider range of tasks. The components used
in this setup were:</p>
        <list list-type="bullet">
          <list-item><p>RoBERTa: RobertaTokenizer, TFRobertaForSequenceClassification.</p></list-item>
          <list-item><p>RoBERTuito sentiment analysis and RoBERTuito base pretrained models: AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments.</p></list-item>
        </list>
        <p><bold>2.3.1. Model Implementation</bold></p>
        <p>Multinomial Naïve Bayes: This model was used for classification with discrete features (word
counts). We trained it with and without smoothing (the alpha parameter).
The model was evaluated using accuracy, precision, recall, and F1-score for the
standard version and area under the curve (AUC) for the version with smoothing.</p>
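        <p>The role of the alpha parameter can be seen in a small standard-library sketch of Multinomial Naïve Bayes on word counts. This is an illustration rather than the scikit-learn estimator used in the study; the function names and the four toy messages are hypothetical, and alpha is assumed to be positive so that no word probability is zero.</p>
        <preformat>
```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(documents, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class log word likelihoods."""
    vocabulary = sorted({tok for doc in documents for tok in doc.lower().split()})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(documents, labels):
        word_counts[label].update(doc.lower().split())
    model = {}
    for label, n_label in class_docs.items():
        total = sum(word_counts[label].values())
        # Additive smoothing: alpha is added to every word count.
        denom = total + alpha * len(vocabulary)
        log_likelihood = {
            tok: math.log((word_counts[label][tok] + alpha) / denom)
            for tok in vocabulary
        }
        model[label] = (math.log(n_label / len(labels)), log_likelihood)
    return model


def predict(model, document):
    """Return the class with the highest log score; unseen words are skipped."""
    scores = {}
    for label, (log_prior, log_likelihood) in model.items():
        scores[label] = log_prior + sum(
            log_likelihood[tok]
            for tok in document.lower().split()
            if tok in log_likelihood
        )
    return max(scores, key=scores.get)


# Hypothetical toy data: 1 marks concerning messages, 0 marks neutral ones.
docs = ["ya no quiero seguir", "no quiero seguir así",
        "hoy fue un buen día", "qué buen día con amigos"]
labels = [1, 1, 0, 0]
model = train_multinomial_nb(docs, labels, alpha=1.0)
print(predict(model, "no quiero seguir"), predict(model, "un buen día"))
```
        </preformat>
        <p>Without smoothing, a word that never occurs in one class would have zero probability there and would veto that class for any message containing it; a positive alpha avoids this.</p>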
        <p>Logistic Regression, Random Forest, and SVC: These models were applied after transforming the
training dataset with TfidfVectorizer. For the SVC model, the impact of document length as an additional
feature was explored by combining it with the tf-idf matrix using hstack. The models were evaluated
with the AUC metric, which provides an aggregate measure of performance across all classification
thresholds. For the Logistic Regression model, document length and digit count were added to the feature
set to observe any potential improvement in performance.
Finally, for a deeper linguistic analysis, pre-trained word embeddings from the
word2vec-google-news-300 model were used to convert messages into average vector representations. Both
Logistic Regression and Random Forest classifiers were trained on these embeddings.</p>
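        <p>The two kinds of features described above can be sketched as follows: an average of word vectors per message, and the extra document length and digit count features. The 3-dimensional vectors are hypothetical stand-ins for the 300-dimensional pretrained embeddings, the function names are invented for illustration, and measuring document length in characters is an assumption.</p>
        <preformat>
```python
def average_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [embeddings[tok] for tok in tokens if tok in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def length_and_digit_features(message):
    """Return document length in characters and the number of digit characters."""
    return [float(len(message)), float(sum(ch.isdigit() for ch in message))]


# Hypothetical toy vectors standing in for pretrained word embeddings.
toy_embeddings = {"tired": [1.0, 0.0, 2.0], "alone": [3.0, 2.0, 0.0]}
message = "tired and alone since 2019"
vector = average_embedding(message.lower().split(), toy_embeddings, dim=3)
extras = length_and_digit_features(message)
print(vector, extras)
```
        </preformat>
        <p>Words missing from the embedding table are ignored in the average, so a message made up entirely of unknown words maps to the zero vector.</p>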
        <p>LSTM: The TensorFlow and Keras frameworks were used for their support for deep
learning applications. The model was built with the Sequential API and comprised
the following layers for text processing and sequence learning:</p>
        <list list-type="bullet">
          <list-item><p>An Embedding layer to transform text inputs into dense vectors of fixed size, capturing semantic information.</p></list-item>
          <list-item><p>An LSTM layer to learn long-term dependencies within the text data.</p></list-item>
          <list-item><p>A SpatialDropout1D layer to prevent overfitting by dropping entire 1D feature maps during training.</p></list-item>
          <list-item><p>A Dense layer to produce the probability distribution over the target classes.</p></list-item>
        </list>
        <p>Preprocessing of text data involved tokenizing the texts and padding them to a uniform length to
ensure consistent input dimensions across all samples. This was facilitated by Keras’ text preprocessing
utilities. Additionally, categorical target variables were encoded using scikit-learn’s LabelEncoder,
adapting them for model training. The model’s performance was evaluated using standard metrics
from scikit-learn, including accuracy score and a detailed classification report, providing insights into
the precision, recall, and F1-scores across different classes.</p>
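        <p>The preprocessing steps above (tokenising, padding to a uniform length, and encoding labels) can be sketched with the standard library. This is not the Keras or scikit-learn code used in the study: the function names are hypothetical, the toy messages and labels are invented, and padding and truncating at the start of each sequence is an assumption taken to match the defaults of Keras' pad_sequences.</p>
        <preformat>
```python
def build_vocabulary(texts):
    """Map each token to a positive integer id; 0 is reserved for padding."""
    vocabulary = {}
    for text in texts:
        for tok in text.lower().split():
            vocabulary.setdefault(tok, len(vocabulary) + 1)
    return vocabulary


def encode_and_pad(texts, vocabulary, max_len):
    """Convert texts to id sequences of exactly max_len (max_len must be positive)."""
    padded = []
    for text in texts:
        ids = [vocabulary[tok] for tok in text.lower().split() if tok in vocabulary]
        ids = ids[-max_len:]  # keep the last max_len tokens
        padded.append([0] * (max_len - len(ids)) + ids)  # pad on the left
    return padded


def encode_labels(labels):
    """Assign integer codes to labels in sorted order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


# Hypothetical toy data.
texts = ["no puedo dormir", "hoy fue un buen día"]
vocabulary = build_vocabulary(texts)
sequences = encode_and_pad(texts, vocabulary, max_len=4)
codes, classes = encode_labels(["riesgo", "control", "riesgo"])
print(sequences, codes, classes)
```
        </preformat>
        <p>Every sequence has the same length after this step, which is what allows the messages to be stacked into a single input array for the Embedding layer.</p>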
        <p>RoBERTa: The pretrained RoBERTa model was applied to the collected training dataset to identify
indications of suicidal ideation in social media text.</p>
        <p>RoBERTuito sentiment analysis: The RoBERTuito sentiment analysis model was fine-tuned on
the collected training dataset for the same task.</p>
        <p>RoBERTuito base uncased: The RoBERTuito base uncased model was likewise fine-tuned on the collected
training dataset. The scores of all three models are reported in Section 3.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>Traditional classifiers: Multinomial Naïve Bayes, Logistic Regression, Random Forest, and
SVC showed limited performance. The Multinomial Naïve Bayes model achieved
an accuracy of 57.93%, and the SVC model was slightly better at 59.8%. The Logistic Regression model
reached an accuracy of 63%, and the Random Forest model
outperformed the other traditional classifiers with an accuracy of 69%.</p>
      <p>LSTM: Trained on the collected dataset to identify suicidal ideation in social
media text, the LSTM model achieved an overall accuracy of 61.6%. The model exhibited a macro precision of
61.6% and a macro recall of 61.1%, resulting in a macro F1-score of 60.9%. The micro-averaged precision,
recall, and F1-score all stand at 61%.</p>
      <p>RoBERTa: Trained on the collected dataset to identify indications
of suicidal ideation in social media text, the pretrained RoBERTa model achieved an overall accuracy of 66%.
The model exhibited a macro precision of 66.07% and a macro recall of 65.85%, resulting in a macro F1-score
of 65.81%. The micro-averaged precision, recall, and F1-score all stand at 66%.</p>
      <p>RoBERTuito sentiment analysis: Fine-tuned on the collected dataset for the same task, the
RoBERTuito sentiment analysis model achieved an overall accuracy of 74.65%.
The model exhibited a macro precision of 74.65% and a macro recall of 75.14%, resulting in a macro
F1-score of 74.59%. The micro-averaged precision, recall, and F1-score all stand at 74.65%.</p>
      <p>RoBERTuito base uncased: Fine-tuned on the collected dataset for the same task, the
RoBERTuito base uncased model achieved an overall accuracy of 89.61%. The
model exhibited a macro precision of 94.59% and a macro recall of 63.64%, resulting in a macro F1-score
of 68.57%. The micro-averaged precision, recall, and F1-score all stand at 89.61%.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>This study evaluated the effectiveness of various classifiers, both traditional models
and advanced deep learning approaches, in detecting indications of suicidal ideation within social media
posts. Our analysis highlights varying degrees of success and points to areas for further
optimization.</p>
      <p>The traditional classifiers (Multinomial Naïve Bayes, SVC, Logistic Regression, and Random
Forest) and the LSTM showed accuracies ranging from 57.93%
to 69%. These models, while historically robust across various text classification tasks, fell short in
this context. The best of them, the Random Forest model, achieved an accuracy of 69%, which still
leaves significant room for improvement and underscores the need to explore more sophisticated deep learning
models capable of handling the complexities of mental health data, particularly for a topic as delicate
as suicidal ideation.</p>
      <p>The RoBERTa model achieved an accuracy of 66% on the held-out portion of the collected dataset,
meaning that it correctly predicts the binary labels for only about two-thirds of the examples. While this
demonstrates some capability to generalize from training data to unseen data, it leaves
substantial room for improvement.</p>
      <p>The fine-tuned RoBERTuito sentiment analysis model achieved an accuracy of 74.65% on the held-out
portion of the collected dataset, indicating a relatively high level of correctness in classifying text as
either indicative or non-indicative of suicidal ideation. This suggests that the model effectively leverages
its sentiment analysis training to make predictions in a new, closely related domain. Nevertheless, there is
still potential for enhancement.</p>
      <p>The choice of RoBERTuito base uncased after trying the RoBERTuito sentiment analysis model was
motivated by the multiclass nature of the latter. Because it has been
trained specifically to classify text into sentiment categories such as positive, negative, and
neutral, it may retain a bias toward that task even after being adapted to binary classification.</p>
      <p>In contrast, RoBERTuito Base Uncased is not a multiclass sentiment classifier but a
general-purpose model pretrained on a language modeling task without specific tuning for sentiment analysis
or any other specialized classification task. Here it served as a foundational model that was adapted
to this specific binary classification.</p>
      <p>RoBERTuito Base Uncased reached an accuracy of 89.61% on the held-out testing portion of our collected
data, which suggests that the model generalizes effectively from the training data to
unseen data drawn from the same sources when distinguishing between texts that do and
do not suggest suicidal ideation.</p>
      <p>The macro-averaged precision of 94.59% indicates that, on average, the model’s predictions for each
class are reliable. However, the macro-averaged recall of 63.64% suggests that the model, while precise,
is conservative in predicting positive instances and is likely missing a proportion of actual positive
cases. The micro-averaged metrics coincide with the overall accuracy, as they necessarily do in
single-label classification, so the macro-averaged scores are the more informative indicator of per-class
behavior.</p>
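      <p>The difference between macro and micro averaging can be checked with a small standard-library sketch. The function names and the twelve toy labels are hypothetical and are not the study data; the example only illustrates how high macro precision can coexist with low macro recall on an imbalanced set.</p>
      <preformat>
```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from raw counts; empty denominators give 0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


def averaged_scores(y_true, y_pred):
    """Return accuracy plus macro- and micro-averaged (precision, recall, F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    per_class, totals = [], [0, 0, 0]
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        per_class.append(precision_recall_f1(tp, fp, fn))
        totals = [totals[0] + tp, totals[1] + fp, totals[2] + fn]
    # Macro: average the per-class scores. Micro: pool the counts first.
    macro = tuple(sum(score[i] for score in per_class) / len(labels) for i in range(3))
    micro = precision_recall_f1(*totals)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, macro, micro


# Hypothetical imbalanced toy data: 8 negative and 4 positive messages,
# with only one positive message detected.
y_true = [0] * 8 + [1] * 4
y_pred = [0] * 8 + [1, 0, 0, 0]
accuracy, macro, micro = averaged_scores(y_true, y_pred)
print("accuracy:", round(accuracy, 4))
print("macro P/R/F1:", [round(v, 4) for v in macro])
print("micro P/R/F1:", [round(v, 4) for v in micro])
```
      </preformat>
      <p>In this toy case the micro-averaged scores equal the accuracy, while the macro averages expose the weak recall on the minority class.</p>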
      <p>When our fine-tuned RoBERTuito Base Uncased model was tested on the test data provided by MentalRiskES,
its performance was evaluated across three runs in comparison with other teams and a baseline
model. The model achieved an accuracy of 0.691 in every run. In terms of macro-averaged metrics, it
attained a Macro_Precision (Macro_P) of 0.345, a Macro_Recall (Macro_R) of 0.500, and a Macro_F1
score of 0.409. The micro-averaged Micro_Precision (Micro_P), Micro_Recall (Micro_R), and Micro_F1
were all 0.691. On the Early Risk Detection Error (ERDE) metrics, the model obtained an
ERDE5 score of 0.261 and an ERDE50 score of 0.261. Its latency to positive prediction (latencyTP) was
consistent at 0.214, with a speed of 1 and a latency-weighted F1 score of 0.817. See Table 1. Overall,
performance was identical across all runs. The latency-weighted F1 score was strong, but the gap between
the micro-averaged scores (0.691) and the macro-averaged F1 (0.409) indicates that performance was uneven
across the two classes on the official test data.</p>
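      <p>The reported shared-task figures can be compared with a simple reference case. The sketch below computes the macro-averaged scores of a binary classifier that assigns every item to a single class. The share of 0.691 for that class is taken from the reported accuracy and is an assumption, since the class distribution of the official test set is not given here; the function name is hypothetical.</p>
      <preformat>
```python
def constant_predictor_macro(predicted_class_share):
    """Macro precision, recall and F1 when every item is assigned to one class.

    predicted_class_share is the true proportion of that class in the data.
    """
    # Predicted class: precision equals its share and recall is 1.
    p_pred, r_pred = predicted_class_share, 1.0
    f1_pred = 2 * p_pred * r_pred / (p_pred + r_pred)
    # Never-predicted class: precision, recall and F1 are taken as 0.
    macro_p = (p_pred + 0.0) / 2
    macro_r = (r_pred + 0.0) / 2
    macro_f1 = (f1_pred + 0.0) / 2
    return macro_p, macro_r, macro_f1, f1_pred


# Assumed share, equal to the reported accuracy of 0.691.
macro_p, macro_r, macro_f1, f1_pred = constant_predictor_macro(0.691)
print(round(macro_p, 4), round(macro_r, 3), round(macro_f1, 3), round(f1_pred, 3))
```
      </preformat>
      <p>Under this assumption the sketch gives a macro precision of about 0.345, a macro recall of 0.500, a macro F1 of about 0.409, and an F1 of about 0.817 for the predicted class. These values match the reported scores, which is consistent with, but does not by itself prove, the model having assigned one class to all test instances.</p>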
      <p>While traditional classifiers such as Logistic Regression, Naïve Bayes, and Random Forest have
demonstrated utility in handling a variety of text classification tasks, the evolution of deep learning has
introduced more sophisticated models that can capture intricate patterns in large datasets. This shift
marks a significant advancement in the field of natural language processing.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Directions</title>
      <p>While the transition from traditional classifiers to deep learning models like RoBERTuito
represents a significant step forward in our capacity to process and analyze mental health data, further
work is needed, particularly in Spanish-speaking communities, where little research has
been performed: the detection of suicidal ideation should not remain focused on a few specific
demographic groups. The results of this study underscore the need for continued improvement,
especially in raising recall without sacrificing precision, to effectively support suicide prevention
efforts and provide timely interventions in Spanish-speaking populations, where such resources are
critically needed.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Limitations</title>
      <p>When the fine-tuned RoBERTuito base uncased model was tested on the official test dataset provided by the
MentalRiskES committee at IberLEF 2024, it obtained an accuracy of 69.1%, a macro-averaged
precision of 34.5%, a macro-averaged recall of 50%, and a macro-averaged F1 score of 40.9%, together with a
micro-averaged precision, recall, and F1 score of
69.1% each. This drop relative to the held-out results could be due to differences between the training and
test datasets. The training data were assembled to mirror the characteristics of the test
set provided by the task organizers as closely as possible, but their exact content and context may differ.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This research was supported by the National Autonomous University of Mexico. Special thanks to the
MentalRiskES 2024 task organizers and the IberLEF 2024 committee.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Galson</surname>
          </string-name>
          , Mental health matters,
          <source>Public Health Reports</source>
          <volume>124</volume>
          (
          <year>2009</year>
          )
          <fpage>189</fpage>
          -
          <lpage>191</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Harmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. V. H.</given-names>
            <surname>Duong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadabadi</surname>
          </string-name>
          , Suicidal ideation,
          <source>StatPearls</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <collab>World Health Organization</collab>
          ,
          <source>Suicide, Fact Sheets</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jashinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Burton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Hanson</surname>
          </string-name>
          , J. West,
          <string-name>
            <given-names>C.</given-names>
            <surname>Giraud-Carrier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Barnes</surname>
          </string-name>
          , T. Argyle,
          <article-title>Tracking suicide risk factors through Twitter in the US</article-title>
          ,
          <source>Crisis</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Aldhyani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Alsubari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Alshebami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Alkahtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <article-title>Detecting and analyzing suicidal ideation on social media using deep learning and machine learning models</article-title>
          ,
          <source>International Journal of Environmental Research and Public Health</source>
          <volume>19</volume>
          (
          <year>2022</year>
          )
          <fpage>12635</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Reece</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Danforth</surname>
          </string-name>
          ,
          <article-title>Instagram photos reveal predictive markers of depression</article-title>
          ,
          <source>EPJ Data Science</source>
          <volume>6</volume>
          (
          <year>2017</year>
          )
          <fpage>15</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Coppersmith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Leary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Crutchley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fine</surname>
          </string-name>
          ,
          <article-title>Natural language processing of social media as screening for suicide risk</article-title>
          ,
          <source>Biomedical Informatics Insights</source>
          <volume>10</volume>
          (
          <year>2018</year>
          )
          <fpage>1178222618792860</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mármol-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M. P.</given-names>
            <surname>del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D. M.</given-names>
            <surname>González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T. M.</given-names>
            <surname>Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Ráez</surname>
          </string-name>
          ,
          <article-title>Overview of MentalRiskES at IberLEF 2024: Early detection of mental disorders risk in Spanish</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>73</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>R. F.</string-name>
          ,
          <source>Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024), co-located with the 40th Conference of the Spanish Society for Natural Language Processing (SEPLN 2024)</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Hassan</surname>
          </string-name>
          , U. Zakia,
          <article-title>Recognizing suicidal intent in depressed population using nlp: a pilot study</article-title>
          ,
          <source>in: 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>0121</fpage>
          -
          <lpage>0128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Steigerwald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ramírez-Castañeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Y.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Báldi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Shapiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bowker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Tarvin</surname>
          </string-name>
          ,
          <article-title>Overcoming language barriers in academia: Machine translation tools and a vision for a multilingual future</article-title>
          ,
          <source>BioScience</source>
          <volume>72</volume>
          (
          <year>2022</year>
          )
          <fpage>988</fpage>
          -
          <lpage>998</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bowker</surname>
          </string-name>
          ,
          <article-title>Promoting linguistic diversity and inclusion</article-title>
          ,
          <source>The International Journal of Information, Diversity, Inclusion</source>
          <volume>5</volume>
          (
          <year>2021</year>
          )
          <fpage>127</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>