<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Written Goodbyes: How Genre and Sociolinguistic Factors Influence the Content and Style of Suicide Notes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lucia Busso</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudia Roberta Combei</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Aston Institute for Forensic Linguistics, Aston University</institution>
          ,
          <addr-line>Birmingham</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Studi Umanistici, Università di Pavia</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The study analyses a novel corpus of 76 freely available, authentic English suicide notes (SNs) (letters and social media posts), spanning from 1902 to 2023. By using NLP and corpus linguistics tools, this research aims at decoding patterns of content and style in SNs. In particular, we explore variation in linguistic features in SNs across sociolinguistic factors (age, gender, addressee, time period) and between text types, referred to as genres (letters vs. online posts). To this end, we use topic models, subjectivity analysis, and sentiment and emotion analysis. Results highlight how both discourse and emotion expression show differences depending on genre, gender, age group and time period. We suggest a more nuanced approach to personalized prevention and intervention strategies based on insights from computer-assisted linguistic analysis.</p>
      </abstract>
      <kwd-group>
        <kwd>suicide notes</kwd>
        <kwd>topic modelling</kwd>
        <kwd>sentiment and emotion analysis</kwd>
        <kwd>subjectivity analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>This paper investigates the language of suicide notes, with the goal of uncovering patterns of discourse, topics, and emotional expression across various sociolinguistic factors and relationship dynamics, spanning over 100 years. A suicide note (SN) has been defined in the literature as "any available text by a suicide which was authored shortly before death" ([<xref ref-type="bibr" rid="ref1">1</xref>]: 26).</p>
      <p>The importance of a detailed analysis of suicide notes has been acknowledged in the scholarly debate ([<xref ref-type="bibr" rid="ref2">2</xref>]). In fact, SNs have been widely studied in linguistics, sociology, and psychology, starting with the publication in 1959 of Osgood and Walker’s seminal work ([<xref ref-type="bibr" rid="ref3">3</xref>]). Since then, the language of SNs has been investigated mainly through Genre Analysis ([4]), with some scholars working with corpus methods ([<xref ref-type="bibr" rid="ref1">1, 5</xref>]). Lately, big corpora of SNs have been collected through the Web and used for computational analyses (inter alia [6, 7, 8]).</p>
      <p>Research on SNs is naturally practical, being focused on suicide prevention ([9]), identification ([10]), and authenticity ([11]). For instance, the study by [6] uses classification algorithms to help mental health professionals distinguish between genuine and elicited suicide notes. This, the authors claim, can in turn help develop a prediction strategy for repeated suicide attempts, as suicide notes offer valuable insights into specific personality states and mindsets. Similarly, [7] suggests that analysing SNs may contribute to assessing the risk of repeated suicide attempts.</p>
      <p>Despite the area being well-researched, especially in forensic linguistics, current analyses of SNs present several shortcomings. Given the difficulty of accessing data, scholars have either used dubious source material (such as the letters published on the blog "The Holy Dark"), or have reused and reanalysed SNs written by famous people (such as Virginia Woolf and Kurt Cobain, e.g., [12, 13]). Moreover, to the best of the authors’ knowledge, no study to date analyses SNs using text type, which we refer to as genre, or sociolinguistic factors (such as gender, age, addressee, or time period) as covariates.</p>
      <p>In the present paper, we set out to perform corpus and computational analyses on a novel dataset of authentic suicide notes. Specifically, we aim to explore whether and to what extent the style and content of SNs vary according to genre (letter vs. online post) and sociolinguistic factors (the victim’s gender and age, as well as the addressee and time period of the SN). To this end, we employ Structural Topic Modelling ([14]) and keyword analysis, subjectivity analysis ([15]), and sentiment and emotion analysis ([16, 17]).</p>
      <p>CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy</p>
      <p>*Corresponding author.</p>
      <p>†Although both authors are equally responsible for the conceptualization and the contents of this article, for the purposes of Italian academia, we specify what follows: L. Busso authored Section 1, Section 2, and Section 3; C. R. Combei authored Section 4, Section 5, and Section 6.</p>
      <p>l.busso@aston.ac.uk (L. Busso); claudiaroberta.combei@unipv.it (C. R. Combei)</p>
      <p>https://orcid.org/0000-0002-5665-771X (L. Busso); https://orcid.org/0000-0003-1884-8205 (C. R. Combei)</p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-1-1">
      <title>2. Data</title>
      <p>Despite the presence in the literature of various datasets of suicide letters, none, to the authors’ best knowledge, are freely available to other researchers. Furthermore, existing corpora are usually either very small, and hence not suitable for quantitative analysis, or too big, and hence not controlled for the parameters we are interested in analysing. Therefore, we decided to collect a new dataset of genuine suicide notes to fill this gap, and to make it available to researchers interested in the topic. Given the sensitivity of the topic, corpus files are available upon request to the authors. Using the semi-automated software Bootcat ([18]), we collected a corpus consisting of 76 suicide letters and social media posts<sup>1</sup>. The SNs have the following characteristics:</p>
      <list list-type="bullet">
        <list-item><p>freely available on the open internet (i.e., not behind paywalls or log-in platforms)</p></list-item>
        <list-item><p>taken from reputable news websites to ensure authenticity (i.e. not taken from blogs or other non-official sources)</p></list-item>
        <list-item><p>only notes that were reproduced in full (i.e. not from extracts or quotes in other texts)</p></list-item>
      </list>
      <p>The resulting corpus contains 26,214 tokens, and includes texts from 1902 to 2023. Unavoidably, the distribution is skewed towards more recent texts: only 5 texts are from before 1950, and only 14 are from before 1990, while the majority of the corpus (75%) consists of SNs from 1990 to the present day. However, the corpus is balanced for textual type (genre), with 43 letters (51% of the tokens) and 33 social media posts (49% of the tokens). The SNs also cover a wide range of addressees, including messages directed to family, life-partners (including ex-partners), friends, the internet, or cases where the addressee is unspecified.</p>
      <p><sup>1</sup>We based our data retrieval on the sources provided by [7], and expanded on them through targeted Google searches. For privacy reasons, online posts were only collected if reported by newspaper articles, and were not retrieved on social media platforms themselves.</p>
    </sec>
    <sec id="sec-2">
      <title>3. Topic Modelling and keywords analysis</title>
      <sec id="sec-2-1">
        <title>3.1. Structural Topic Models</title>
        <p>Topic Models (TM) are a family of unsupervised learning algorithms that cluster co-occurring words across documents into thematic nodes, or "topics" ([19]). These algorithms require substantial human input, as the topics retrieved should be interpretable by the researcher assigning meaning to the patterns discovered ([20, 21]).</p>
        <p>In this study we use Structural Topic Modelling ([14]), a type of TM that allows the distribution of topics to be modelled as a function of document-level covariates in regression-like schemes. The STM analyses are performed in R ([22]). We select a number of topics K=3 based on mathematical fit and ease of interpretation, and we model the effect of the "genre" covariate on topic content (i.e. lexical content used within topics) and prevalence (i.e. the frequency with which a topic is discussed).</p>
        <p>Figure 1 shows the top 10 word probabilities for the 3 topics in the corpus. Following extensive concordance analysis to explore the keywords in context, the three topics have been labelled:</p>
        <list list-type="order">
          <list-item><p>Topic 1: Explanations. This topic clusters words related to reasons, motives, and emotions associated with the act of suicide.</p></list-item>
          <list-item><p>Topic 2: Anguish. This topic clusters words related to the intimate feelings of pain and hurt that accompany suicidal ideation.</p></list-item>
          <list-item><p>Topic 3: Connectedness. This topic clusters words that refer to close connections to other people in the victim’s life.</p></list-item>
        </list>
        <p>As mentioned above, we model the effect of genre (letter vs. online post) on topical content and prevalence. While we find no statistical differences (p &gt; .05) for topical content, some interesting differences arise in topical prevalence, as can be seen in Table 1 and in Figure 2. Specifically, we observe that online posts discuss private feelings of anguish and pain (Topic 2) significantly less, and interpersonal relationships (Topic 3) significantly more.</p>
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Keyword Analysis</title>
        <p>To explore the corpus further, beyond the "black box" of the STM algorithm, we performed a keyword analysis. Using SketchEngine ([23]), we extracted keywords for both letters and social media posts using EnTenTen21 as the reference corpus. To ensure that we only consider words that are used throughout the corpus, we discarded instances with a low ARF (average reduced frequency) score ([24]). Not surprisingly, many keywords are shared across the two subcorpora, reflecting "universal" themes of suicidal ideation such as apologies, goodbyes, and explanations. However, idiosyncratic keywords paint an interesting picture (see Figure 3), as online posts seem to display a lower prevalence of intimate feelings, and more polarized emotion words and swearwords.</p>
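        <p>To illustrate the dispersion filter used in the keyword analysis, the following is a minimal Python sketch of average reduced frequency. It assumes the usual definition: the corpus of N tokens is treated as circular, v = N / f is the mean gap for a word with f occurrences, and ARF is the sum of min(d, v) over the gaps d between consecutive occurrences, divided by v. The function name and the toy positions are hypothetical, and this is not SketchEngine's own implementation.</p>
        <preformat>
```python
def average_reduced_frequency(positions, corpus_size):
    """Return the ARF of a word given its 0-based token positions.

    Assumed definition: ARF = (1 / v) * sum(min(d_i, v)), with v = N / f and
    d_i the cyclic distances between consecutive occurrences.
    """
    f = len(positions)
    if f == 0:
        return 0.0
    ordered = sorted(positions)
    v = corpus_size / f
    # Gap from the last occurrence back around to the first (circular corpus).
    gaps = [ordered[0] + corpus_size - ordered[-1]]
    # Gaps between neighbouring occurrences.
    gaps += [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(min(d, v) for d in gaps) / v


# Toy, made-up positions in a 100-token corpus (not data from the paper).
evenly_spread = average_reduced_frequency([0, 25, 50, 75], 100)
clustered = average_reduced_frequency([0, 1, 2, 3], 100)
```
        </preformat>
        <p>In this toy case, four evenly spaced occurrences keep an ARF equal to their raw frequency of 4, whereas four adjacent occurrences drop to an ARF close to 1. This is why a low ARF flags words that are concentrated in a small part of the corpus.</p>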
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Subjectivity analysis</title>
      <sec id="sec-3-1">
        <p>Subjectivity analysis investigates what is generally labelled as a "private state", namely opinions, feelings, beliefs, speculations ([15]: 674), typically classifying a text on a scale ranging from high objectivity to high subjectivity.</p>
        <p>Our paper uses this analysis because we see
subjectivity as a relevant stylistic and content-related element,
useful for understanding suicidal ideation. Although this
is a preliminary study, we believe that findings from
subjectivity, sentiment, and emotion analysis, supported by
the exploration of psychosocial factors (not the object
of this paper), could be useful for evaluating the risk of
(repeated) suicide attempts. In particular, we expect that
highly subjective texts may signal intense personal
turmoil, which has, in fact, been reported as a potential risk
factor for suicide ([25]).</p>
        <p>This research uses the TextBlob library for Python, which provides tools for various textual analyses, including subjectivity, as part of its sentiment analysis function<sup>2</sup>. The tool uses a pattern analyzer and a pre-defined dictionary of word polarity and subjectivity. It also incorporates intensity, accounting for the impact of modifiers, which can increase or reduce the measured subjectivity score. Each SN is processed to extract its overall subjectivity score, which ranges from 0 (i.e., highly objective) to 1 (i.e., highly subjective). To discuss the effect of genre and sociolinguistic factors on the subjectivity score, we present the results of statistical analyses conducted in R ([22]).</p>
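        <p>The scoring step described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' script: it assumes that TextBlob is installed and reads the score from TextBlob(text).sentiment.subjectivity; the helper names are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not state which estimator was used. The scorer is passed in as a parameter so that the aggregation can be tried without the library.</p>
        <preformat>
```python
from statistics import mean, stdev


def textblob_subjectivity(text):
    """Score one text with TextBlob (0 = objective, 1 = subjective)."""
    # Imported lazily so the rest of the sketch runs without TextBlob.
    from textblob import TextBlob
    return TextBlob(text).sentiment.subjectivity


def summarise_subjectivity(notes, scorer=textblob_subjectivity):
    """Return (mean, sample standard deviation) of the scores of some texts."""
    scores = [scorer(text) for text in notes]
    return mean(scores), stdev(scores)
```
        </preformat>
        <p>Calling summarise_subjectivity on the letters and on the online posts separately would give one mean and one standard deviation per genre, in the same M and SD format used to report the results; at least two texts per group are needed for the standard deviation.</p>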
        <p>First of all, the mean subjectivity score at the corpus
level (M = 0.56, SD = 0.12) indicates that SNs are
characterized by a level of subjectivity that falls above the
midpoint of the scale (0.50); there is, thus, a tendency
toward greater subjectivity than objectivity. Interestingly,
however, the mean subjectivity scores and their
distributions are nearly identical between letters (M = 0.56, SD =
0.13) and social media posts (M = 0.57, SD = 0.12).</p>
        <p>Next, based on Figure 4, SNs written from 1950-1969
seem to have the highest subjectivity score (M = 0.72, SD =
0.15). In contrast, the lowest subjectivity is found for SNs
written from 1990-1999 (M = 0.49, SD = 0.12), followed by
those from 1970-1989 (M = 0.52, SD = 0.15). SNs written
before 1950 (M = 0.56, SD = 0.11), from 2000-2019 (M =
0.56, SD = 0.11), and from 2020-now (M = 0.56, SD = 0.13)
have identical subjectivity scores.</p>
        <p>The results displayed in Figure 5 indicate that subjectivity scores of SNs addressed to life-partners (M = 0.61, SD = 0.06) are the highest, followed by those addressed to family (M = 0.60, SD = 0.09). This suggests that SNs addressed to people with whom the victim has a close relationship are characterized by a deeper personal engagement and a more vivid linguistic expression than those addressed to the internet (M = 0.56, SD = 0.12), to friends (M = 0.55, SD = 0.08), and to other addressees (M = 0.54, SD = 0.16). The standard deviations for most addressees (i.e., partner, family, friends) are relatively small, suggesting limited variation within these groups.</p>
        <p>Figure 4: Subjectivity as a function of the year group</p>
        <p>Figure 5: Subjectivity as a function of addressee</p>
        <p>As regards the victim’s gender, the average subjectivity score for females (M = 0.58, SD = 0.11) is slightly higher than the score for males (M = 0.54, SD = 0.14), but the standard deviations indicate that the ranges overlap to a large extent. Finally, no consistent tendency emerges from the distribution of subjectivity scores with respect to the victim’s age. In fact, there is substantial variation within each age group, suggesting that the degree of subjectivity in SNs is influenced by other factors.</p>
        <p><sup>2</sup>The sentiment analysis score itself obtained from the TextBlob tool is not used in this study, as more advanced methods for investigating sentiment are preferred (see Section 5).</p>
      </sec>
      <sec id="sec-3-2">
        <title>5. Sentiment and emotion analysis</title>
        <p>In order to obtain a more fine-grained image of the emotional dimension of the SNs, and to complement the previously discussed findings on the topics and subjectivity of these texts, we also present and discuss the results of sentiment and emotion analysis. Sentiment analysis is defined as "the task of finding the opinions of authors about specific entities" ([26]: 82). Emotion analysis (also emotion classification), on the other hand, is often seen as a more refined version of sentiment analysis, since it deals with the identification of primary emotions in a text ([27]).</p>
        <p>For this research we employ the latest version available (at the time of writing) of Twitter-roBERTa-base for sentiment analysis, a model trained on over 124 million tweets that is fine-tuned for this task with the TweetEval benchmark ([<xref ref-type="bibr" rid="ref4">28, 29, 30</xref>]). For emotion classification, we use the Emotion English DistilRoBERTa-base model ([<xref ref-type="bibr" rid="ref5">31</xref>]) to extract Ekman’s six basic emotions ([<xref ref-type="bibr" rid="ref6">32</xref>]): anger, disgust, fear, joy, sadness, and surprise, along with a neutral class. The model is a fine-tuned version of DistilRoBERTa-base, trained on a balanced subset drawn from six datasets, with 2,811 observations per emotion, for a total of almost 20,000 observations.</p>
        <p>Our analysis reveals that the average probability of negative sentiment (M = 0.61, SD = 0.31) is roughly three times higher than the average probability of neutral (M = 0.22, SD = 0.15) and positive sentiment (M = 0.17, SD = 0.28). Then, the dominant sentiment in each SN is determined by identifying the highest probability among the three sentiment classes. We find that 73% of the SNs have negative sentiment as the highest probability, 17.1% positive sentiment, and 9.2% neutral sentiment. This trend is also supported by Figure 6, which shows the distribution of sentiment probabilities, confirming that most SNs have a higher likelihood of expressing negative sentiment. We interpret these results as a reflection of the emotional distress tied to both writing the suicide notes and the thoughts surrounding the act of suicide itself.</p>
        <p>Some interesting tendencies are observed from the analysis of sentiment distribution across sociolinguistic factors and genre. First, Figure 7 illustrates a consistent difference between the two genres: online posts have a higher prevalence of negative sentiment (90.9%) compared to letters (60.5%).</p>
        <p>Next, all SNs from 1970-1989 show negative sentiment as being dominant (100%). A high presence of negative sentiment (88.5%) is also found in SNs written from 2020-now. Interestingly, SNs from 1990-1999 display a balanced sentiment distribution (50% negative and 50% positive), marking the only period in our corpus with such a high presence of positive sentiment. This situation could be due to the fact that the authors of these (very long) SNs are well-known celebrities (e.g., Kurt Cobain and OJ Simpson). Even if the letters were not intended for the general public, the idea that these texts might eventually become public could have influenced the victims to transmit more positive messages.</p>
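        <p>The dominant-class step used for the percentages in this section (taking, for each SN, the class with the highest predicted probability and then counting how often each class wins) can be sketched as below. The function names and the probability values are made up for illustration and are not taken from the corpus; ties, which the paper does not discuss, are resolved here in favour of the class listed first.</p>
        <preformat>
```python
from collections import Counter


def dominant_label(probabilities):
    """Return the class with the highest probability for one note."""
    return max(probabilities, key=probabilities.get)


def dominant_shares(notes):
    """Return the percentage of notes whose dominant class is each label."""
    counts = Counter(dominant_label(p) for p in notes)
    return {label: 100 * n / len(notes) for label, n in counts.items()}


# Made-up probability triples, not values from the corpus.
toy_notes = [
    {"negative": 0.80, "neutral": 0.15, "positive": 0.05},
    {"negative": 0.10, "neutral": 0.20, "positive": 0.70},
    {"negative": 0.55, "neutral": 0.30, "positive": 0.15},
    {"negative": 0.60, "neutral": 0.25, "positive": 0.15},
]
shares = dominant_shares(toy_notes)
```
        </preformat>
        <p>The same two functions apply unchanged to the seven emotion classes, and restricting the input list to one genre, period, addressee, gender, or age group yields the per-group percentages.</p>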
        <p>Some patterns of sentiment distribution are traceable when considering the addressee of the SN. Positive sentiment is more common when the addressee is the victim’s partner (40%) or family (35.7%). In contrast, a very high percentage of negative sentiment is observed in SNs addressed to the general public on the internet (93.1%). Figure 8 shows that negative sentiment is slightly more frequent in SNs written by female victims (72.7%) compared to male victims (68.2%). As for the victim’s age, a distinct pattern is difficult to identify, but negative sentiment is the most frequent (over 65%) in SNs written by teenagers (10s) and people in their 20s, 30s, 40s, and 60s.</p>
        <p>Moving on to emotion analysis, the average probability of SNs conveying sadness (M = 0.48, SD = 0.37) is four times higher than the average probability of conveying anger (M = 0.12, SD = 0.22), fear (M = 0.12, SD = 0.21), and neutrality (M = 0.12, SD = 0.18). Sadness (53.9%) is, indeed, the dominant emotion in the corpus, followed by neutrality (13.2%), anger (11.8%), and fear (7.9%). This is determined by identifying the highest probability among the seven emotion classes for each individual SN.</p>
        <p>We can pinpoint some interesting outcomes from the analysis of emotions across genres and sociolinguistic factors. As concerns genre, Figure 9 depicts an obvious difference between letters and online posts. On the one hand, sadness is more frequent in online posts (59.2%) compared to letters (40%). On the other hand, neutrality and joy, the only two non-negative emotions, are more frequent in letters (14.6% and 8.8%, respectively) than in online posts (9.5% and 3.2%, respectively).</p>
        <p>Figure 9: Emotions as a function of genre</p>
        <p>The analysis reveals that sadness is the most prevalent emotion across all time periods. In particular, the presence of sadness exceeds 50% in SNs from 1970-1989 and from 2020-now. Then, the SNs written from 1970-1989 are also characterized by a definite presence of disgust (22.2%). In line with the sentiment analysis results, SNs from 1990-1999 contain the lowest presence of sadness (40.8%) and generally the lowest presence of negative emotions overall, compared to other periods. SNs written before 1950 display the highest presence of fear (17.3%) in the corpus, although sadness still remains the most prevalent emotion in this period.</p>
        <p>From Figure 10, we can identify a clear disparity between the emotions transmitted by female and male victims. Sadness appears more frequently in SNs written by females (53.1%) compared to males (41.5%). Additionally, anger is more prevalent in SNs written by males (17.1%), ranking as their second most common emotion (after sadness).</p>
        <p>Figure 10: Emotions as a function of gender</p>
        <p>Although Figure 11 illustrates a complex distribution of emotions across the age groups of the victims, some patterns still emerge. Sadness is the most common emotion in the SNs of all age groups except for those written by people in their 30s, where neutrality prevails (36.2%). Interestingly, teenagers express the lowest neutrality (3.4%) and the highest sadness (60.1%). Additionally, fear is prominent among SNs written by people over 70 years old (31.8%), making it the second most frequent emotion for this age group. Fear is also the second most common emotion for SNs written by teenagers (14.6%).</p>
        <p>Figure 11: Emotions as a function of age group</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Conclusions</title>
      <p>This mixed-methods study analysed the content and style of 76 SNs written over the course of a century, using genre, several sociolinguistic factors, and relationship dynamics as covariates. First of all, three main topics emerged from our corpus, which we labelled Explanations, Anguish, and Connectedness. Looking at the differences in topical prevalence between the two text types, we observed that online posts displayed less private feelings (e.g., anguish and pain) and more polarized emotion words and swearwords.</p>
      <p>Subjectivity analysis revealed that SNs tended to be more subjective than objective, irrespective of the genre. Some differences based on addressees were identified in the corpus; for example, SNs directed toward close relationships (i.e., life-partners and family) showed higher subjectivity scores, suggesting a more profound and personal style, compared to those directed toward the broader (internet) public.</p>
      <p>As far as sentiment analysis is concerned, negative sentiment was dominant in the corpus (i.e. three times more frequent than neutral or positive sentiment), especially in online posts. Then, the analysis of emotions revealed that sadness was the main emotion in the corpus. This evident presence of sadness and negative sentiment reflects the complex emotional challenges and inner struggles that victims experienced at the time they wrote their SNs. Although sadness was the most common emotion in both letters and online posts, it occurred more frequently in the latter text type. Also, letters tended to convey positive emotions (e.g., joy) more frequently than online posts. Finally, the analysis revealed that sadness was more common in the SNs written by female victims and by teenagers.</p>
      <p>All in all, our results reveal that the content, discourse, and emotional expression in SNs vary as a function of genre, sociolinguistic factors, and relationship dynamics. These differences highlight the need to take into account specific social, demographic, and cultural variables when designing and implementing suicide prevention and intervention strategies. In this sense, we believe that corpus-based and NLP research on SNs can contribute to the improvement of these personalized strategies.</p>
    </sec>
  </body>
  <back>
    <ack>
      <title>Acknowledgments</title>
      <p>The research presented in this paper was conducted while C. R. Combei benefited from support provided by the project "PON Ricerca e Innovazione 2014–2020 - Linea Innovazione (D.M. 1062/2021)".</p>
    </ack>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>J. J.</given-names> <surname>Shapero</surname></string-name>, <article-title>The language of suicide notes</article-title>, <source>Ph.D. thesis</source>, University of Birmingham, <year>2011</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <string-name><given-names>G.</given-names> <surname>Tellari</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Zanchi</surname></string-name>, <article-title>Il suicidio di universitari nei media italiani: Uno studio basato su corpus</article-title>, in: S. Matiola, M. Milicevic Petrovic (Eds.), CLUB - Working Papers in Linguistics, volume <volume>8</volume>, AMS Acta AlmaDL, Bologna, <year>2024</year>, pp. <fpage>1</fpage>-<lpage>20</lpage>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <string-name><given-names>C. E.</given-names> <surname>Osgood</surname></string-name>, <string-name><given-names>E. G.</given-names> <surname>Walker</surname></string-name>, <article-title>Motivation and language behavior: a content analysis of suicide notes</article-title>, <source>The Journal of Abnormal and Social Psychology</source> <volume>59</volume> (<year>1959</year>) <fpage>58</fpage>–<lpage>67</lpage>.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[4] B. Samraj, J. M. Gawron, The suicide note as a genre: Implications for genre theory, Journal of English for Academic Purposes 19 (2015) 88–101.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[5] A. E. Jaafar, H. A.-S. Jasim, A corpus-based stylistic analysis of online suicide notes retrieved from reddit, Cogent Arts &amp; Humanities 9 (2022) 2047434.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[6] J. Pestian, H. Nasrallah, P. Matykiewicz, A. Bennett, A. Leenaars, Suicide note classification using natural language processing: A content analysis, Biomedical informatics insights 3 (2010) BII–S4706.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[7] S. Ghosh, A. Ekbal, P. Bhattacharyya, Cease, a corpus of emotion annotated suicide notes in english, in: Proceedings of the twelfth international conference on language resources and evaluation, 2020, pp. 1618–1626.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[8] A. M. Schoene, A. Turner, G. R. D. Mel, N. Dethlefs, Hierarchical multiscale recurrent neural networks for detecting suicide notes, IEEE Transactions on Affective Computing 14 (2021) 153–164.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[9] M. Chatterjee, P. Kumar, P. Samanta, D. Sarkar, Suicide ideation detection from online social media: A multi-modal feature based technique, International Journal of Information Management Data Insights 2 (2022) 100103.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[10] T. Zhang, A. M. Schoene, S. Ananiadou, Automatic identification of suicide notes with a transformer-based deep learning model, Internet interventions 25 (2021) 100422.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[11] M. Ioannou, A. Debowska, Genuine and simulated suicide notes: An analysis of content, Forensic science international 245 (2014) 151–160.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[12] E. T. Sudjana, N. Fitri, Kurt cobain’s suicide note case: Forensic linguistic profiling analysis, International Journal of Criminology and Sociological Theory 6 (2013) 217–227.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[13] N. Malini, V. Tan, Forensic linguistics analysis of virginia woolf’s suicide notes, International Journal of Education 9 (2016) 53–58.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[14] M. E. Roberts, B. M. Stewart, D. Tingley, Stm: An R package for structural topic models, Journal of statistical software 91 (2019) 1–40.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[15] A. Montoyo, P. Martínez-Barco, A. Balahur, Subjectivity and sentiment analysis: An overview of the current state of the area and envisaged developments, Decision Support Systems 53 (2012) 675–679.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[16] L. Bing, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions, Cambridge University Press, Cambridge, 2015.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[17] C. Strapparava, R. Mihalcea, Learning to identify emotions in text, in: Proceedings of the 2008 ACM Symposium on Applied Computing, SAC ’08, Association for Computing Machinery, New York, NY, USA, 2008, pp. 1556–1560.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[18] M. Baroni, S. Bernardini, Bootcat: Bootstrapping corpora and terms from the web, in: Proceedings of the fourth international conference on language resources and evaluation, Lisbon, Portugal, 26-28 May 2004, 2004, pp. 1313–1316.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[19] D. M. Blei, Probabilistic topic models, Communications of the ACM 55 (2012) 77–84.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[20] J. Chang, S. Gerrish, C. Wang, J. Boyd-Graber, D. Blei, Reading tea leaves: How humans interpret topic models, Advances in neural information processing systems 22 (2009) 288–296.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[21] M. E. Roberts, B. M. Stewart, D. Tingley, C. Lucas, J. Leder-Luis, S. K. Gadarian, B. Albertson, D. G. Rand, Structural topic models for open-ended survey responses, American journal of political science 58 (2014) 1064–1082.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[22] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2023. URL: https://www.R-project.org/.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[23] A. Kilgarriff, V. Baisa, J. Bušta, M. Jakubíček, V. Kovář, J. Michelfeit, P. Rychlý, V. Suchomel, The sketch engine: ten years on, Lexicography 1 (2014) 7–36.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[24] J. Hlaváčová, P. Rychlý, Dispersion of words in a language corpus, in: Text, Speech and Dialogue: Second International Workshop, TSD’99 Plzen, Czech Republic, September 13–17, 1999 Proceedings 2, Springer, 1999, pp. 321–324.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[25] A. Schuck, R. Calati, S. Barzilay, S. Bloch-Elkouby, I. Galynker, Suicide crisis syndrome: A review of supporting evidence for a new suicide-specific diagnosis, Behavioral Sciences and the Law 37 (2019) 223–239.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[26] R. Feldman, Techniques and applications for sentiment analysis, Communications of the ACM 56 (2013) 82–89.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[27] C. R. Combei, A. Luporini, Sentiment and emotion analysis meet appraisal: A corpus study of tweets related to the COVID-19 pandemic, Rassegna Italiana di Linguistica Applicata 53 (2021) 115–136.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[28] J. Camacho-Collados, K. Rezaee, T. Riahi, A. Ushio, D. Loureiro, D. Antypas, J. Boisson, L. Espinosa Anke, F. Liu, E. Martínez Cámara, TweetNLP: Cutting-edge natural language processing for social media, in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, Abu Dhabi, UAE, 2022, pp. 38–49.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[29] D. Loureiro, F. Barbieri, L. Neves, L. Espinosa Anke, J. Camacho-collados, TimeLMs: Diachronic language models from Twitter, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 251–260.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Espinosa</given-names>
            <surname>Anke</surname>
          </string-name>
          , L. Neves,
          <article-title>TweetEval: Unified benchmark and comparative evaluation for tweet classification</article-title>
          , in: T. Cohn,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          , Y. Liu (Eds.),
          <source>Findings of the Association for Computational Linguistics: EMNLP</source>
          <year>2020</year>
          ,
          <article-title>Association for Computational Linguistics</article-title>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>1644</fpage>
          -
          <lpage>1650</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hartmann</surname>
          </string-name>
          , Emotion English DistilRoBERTa-base, https://huggingface.co/j-hartmann/emotion-english-distilroberta-base/,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          ,
          <article-title>Basic emotions</article-title>
          , in: T. Dalgleish, T. Power (Eds.),
          <source>The Handbook of Cognition and Emotion</source>
          , John Wiley &amp; Sons, Ltd, Sussex, U.K.,
          <year>1999</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>