<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ConversationMoC: Encoding Conversational Dynamics using Multiplex Network for Identifying Moment of Change in Mood and Mental Health Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Loitongbam Gyanendro Singh</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stuart E. Middleton</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tayyaba Azim</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Nichele</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pinyi Lyu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Santiago De Ossorno Garcia</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Management, College of Arts Social Sciences &amp; Humanities, University of Lincoln</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Electronics and Computer Science, University of Southampton</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universidad Complutense de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Understanding mental health conversation dynamics is crucial, yet prior studies often overlooked the intricate interplay of social interactions. This paper introduces a unique conversation-level dataset and investigates the impact of conversational context in detecting Moments of Change (MoC) in individual emotions and classifying Mental Health (MH) topics in discourse. In this study, we differentiate between analyzing individual posts and studying entire conversations, using sequential and graph-based models to encode the complex conversation dynamics. Further, we incorporate emotion and sentiment dynamics with social interactions using a graph multiplex model driven by Graph Convolution Networks (GCN). Comparative evaluations consistently highlight the enhanced performance of the multiplex network, especially when combining reply, emotion, and sentiment network layers. This underscores the importance of understanding the intricate interplay between social interactions, emotional expressions, and sentiment patterns in conversations, especially within online mental health discussions. We are sharing our new dataset (ConversationMoC) and code with the broader research community to facilitate further research.</p>
      </abstract>
      <kwd-group>
        <kwd>Mental health conversation dynamics</kwd>
        <kwd>Moments of Change (MoC)</kwd>
        <kwd>Emotional expressions</kwd>
        <kwd>Graph Convolution Networks (GCN)</kwd>
        <kwd>Multiplex network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>has yielded superior outcomes in tasks such as question answering [13, 14] and personalized recommendation [15, 16]. Additionally, graph-based representations have proven beneficial in other conversation tasks such as dialogue act recognition [11, 17], intent detection [18], and topic modeling [19], contributing to improved performance across these domains. These findings highlight the potential of utilizing network structures to improve the understanding and performance of diverse conversation-related tasks.</p>
      <p>Inspired by prior research, this study explores the potential of leveraging social and meta-interaction information for mental health tasks, including identifying MoC in an individual’s mood and classifying MH discourse. Notably, no existing datasets specifically address MoC detection with a full conversation context, underscoring the novelty and importance of this study. To facilitate our investigation, we have curated a new dataset comprising 967 conversations covering 15 MH topics sourced from the Reddit social media platform (explained in Section 3.2). This dataset offers insights into the intricate interplay between language use and social interactions. Further, to encode the complex conversation dynamics, we utilize a multiplex network representation of conversations, wherein each layer captures different aspects of the conversation, such as emotion, sentiment, and reply interactions (refer to Section 3 for detailed discussion). By introducing a novel dataset and highlighting the significance of representing conversation context via multiplex networks, this study aims to uncover hidden emotional dynamics and understand the impact of social interactions on individual mood shifts. Throughout the paper, the individual who starts the conversation is referred to as the target user, and other participants as non-target users.</p>
      <p>A comprehensive evaluation is performed to assess the effectiveness of the proposed study in detecting MoC and identifying types of MH topics in discourse. Using suitable sequential and graph-based baseline models, the significance of incorporating conversation is evaluated by comparing the model’s performance with and without the conversation’s contextual information. Further, the significance of incorporating multiplex networks is thoroughly explored by comparing the model’s performance for each multiplex layer. The experimental results reveal the substantial benefits of leveraging conversation contextual information for MoC detection, offering a more accurate understanding of the target user’s mood shift and MH classification tasks. Additionally, the inclusion of conversation multiplex network information, particularly the reply and sentiment graphs, significantly enhances the performance of the proposed model, as demonstrated by the results in Table 2. In summary, this study has the following contributions:</p>
      <p>• A new Reddit dataset, augmented with conversational context and carefully annotated for use in Moments of Change (MoC) and Mental Health (MH) discourse classification, is now publicly available for the first time. This dataset introduces an important development in identifying MoC using a valence and arousal space.</p>
      <p>• This study extensively compares suitable baseline models over the new MoC dataset. Further, to encode the complex conversation dynamics, a multiplex network structure is introduced, capturing the intricate interplay between social interactions, emotional expressions, and sentiment patterns within conversations, emphasizing the uniqueness of this research.</p>
      <p>• A comprehensive exploration of the multiplex layers, determining the significance of each layer for conversational MoC and MH classification tasks.</p>
      <p>The rest of the paper is organized as follows: Section 2 provides an overview of related work. Section 3 discusses in detail the dataset curation. Section 4 discusses the experiment designs. Section 5 presents the experimental results and discussion, and finally, the study concludes in Section 6.</p>
      <p>2. Related studies</p>
      <p>2.1. Moment of change detection</p>
      <p>Various studies have investigated the connection between changes in user language on social media platforms and their mental health, specifically identifying significant transitions or shifts in sentiment and/or emotion states. Work includes exploring language changes to establish a foundation for detecting the MoC by analyzing sequential textual content [20, 21]. The CLPsych Shared Task 2022 [<xref ref-type="bibr" rid="ref5">22, 5, 23</xref>] further emphasized detecting MoC and User Mental Health Risk identification tasks, where incorporating pre-trained BERT-based models with BiLSTM frameworks [<xref ref-type="bibr" rid="ref6">6, 24</xref>] showed promising performance on a TalkLife dataset without full conversation context (i.e. target users only). The above studies have examined changes in language patterns of target users to infer shifts in psychological well-being, stress levels, and emotional states, providing insights into the dynamics of mood change over time. However, the conversation of other users with the target users is overlooked in the above studies.</p>
      <p>2.2. Mental health disorder classification</p>
      <p>Numerous studies have explored the utilization of self-reporting posts on social media platforms like Reddit and Twitter as valuable resources for detecting mental health (MH) disorders [<xref ref-type="bibr" rid="ref1 ref4">1, 4, 25</xref>]. Distant supervision has emerged as a popular approach, thanks to its cost-effectiveness and ability to capture the rich expressive dynamics of MH disorders. Commonly studied disorders include schizophrenia, bipolar disorder, depression, anxiety, suicide, eating disorders, and Post-Traumatic Stress Disorder (PTSD). Previous studies have employed n-gram feature engineering methods within a multitask learning framework [26] to classify each MH disorder as a separate task, while others treat all disorders as a single classification task [<xref ref-type="bibr" rid="ref4">4</xref>]. Recent approaches have leveraged fine-tuning of pre-trained BERT models [<xref ref-type="bibr" rid="ref3">27, 3</xref>] and prompt-based masked language models [28, 29] for the MH classification task. However, these studies have primarily focused on classifying MH disorders based solely on the target user’s posts. In contrast to the previous works that focused solely on a target user’s sequence of posts, this study underscores the significance of considering contextual conversation information. By incorporating the contextual information, we aim to gain a more comprehensive understanding of the conversation to accurately identify Moments of Change (MoC) and classify Mental Health (MH) disorder topics in a target user’s discourse.</p>
      <p>In a similar direction concerning mental health-related tasks, [<xref ref-type="bibr" rid="ref8">8</xref>] highlights the significance of comprehending conversational dynamics when identifying posts indicating suicidal ideation. Their work primarily centers on determining whether a post contains suicide ideation information. In contrast, our approach revolves around tracking the temporal evolution of a target user’s posts to identify the MoC of the target user’s moods. Furthermore, this study exploits multiplex graphs capturing various conversation aspects, such as social interactions, emotional expressions, and sentiment patterns, to provide a more nuanced understanding of the conversation dynamics. This insight highlights the distinction and depth of our contributions in the context of conversational analysis and MH detection tasks.</p>
      <p>Figure 2: 2D Valence-Arousal space depicting the moment of change in mood reflected through user posts. The diagonal shift represents Switch (IS), while the horizontal or vertical shift represents Escalation (IE). Axes: valence (negative to positive) and arousal (passive to active). Quadrants: Anger (negative, active), Joy (positive, active), Sad (negative, passive), and Optimism (positive, passive).</p>
      <p>3.1. Mental Health Subreddits Selection</p>
      <p>In this study, our data collection efforts were directed towards 15 distinct mental health (MH) subreddits<sup>4</sup>, each delving into a wide spectrum of MH topics. The selection of these MH topics was meticulously guided by prior research, particularly a study conducted by Low et al. [<xref ref-type="bibr" rid="ref10">30</xref>]. This seminal research offered valuable insights into the prevalence and importance of diverse themes in online mental health (MH) discussions. However, it did not address the specific task of detecting Moments of Change (MoC), laying the groundwork for our dataset curation. It is important to note that our dataset differs in terms of its time frame, spanning from November 1, 2018, to November 1, 2019, thus offering a distinct temporal context. By encompassing these diverse MH topics, we aimed to capture a comprehensive and representative snapshot of MH discussions within various online communities. The 15 MH topics are listed in Table 1.</p>
      <p>3.2. Data Collection</p>
    </sec>
    <sec id="sec-2">
      <title>3. Dataset overview</title>
      <p>This section presents a detailed overview of the dataset utilized in our study, which has been collected from the Reddit social media platform using the Pushshift API<sup>3</sup>. This dataset has been curated to facilitate research in the field of classifying mental health discourse and temporal moment of change (MoC) detection. For ease of reference, we named this dataset ConversationMoC. In the following subsections, we will delve into the dataset’s composition, the data collection process, and the unique attributes that make it a valuable resource for investigating mental health-related conversational dynamics.</p>
      <sec id="sec-2-1">
        <title><sup>3</sup> https://github.com/pushshift/api</title>
        <p>We collected data focusing on the posts that initiated conversations to compile our dataset<sup>5</sup>. Each user’s timeline constitutes a chronological record of their conversations, encompassing their posts and replies from other users. In this study, we use the term post to refer to both user comments and the initiating posts. To ensure meaningful and comprehensive data, we specifically selected conversations in which the target user contributed at least two posts, allowing us to examine the conversation dynamics effectively. Table 1 presents the dataset distribution, which consists of 963 target users participating in 967 conversations; in total, 11,841 users contributed 28,659 posts.</p>
        <p><sup>4</sup> A subreddit is a thematic community on Reddit that focuses on specific topics.</p>
        <p><sup>5</sup> We hypothesize that the target user is either suffering from the subject or interested in learning about it.</p>
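        <p>The selection rule described in this subsection (keep only conversations in which the target user contributed at least two posts) can be sketched as follows. This is a minimal illustration rather than the collection code used for the dataset: the record layout, with the hypothetical keys conversation_id, author, and is_initiating, is an assumption made for the example.</p>
        <preformat>
```python
from collections import defaultdict


def select_conversations(posts, min_target_posts=2):
    """Keep conversations whose target user wrote at least min_target_posts posts.

    posts: iterable of dicts with the hypothetical keys 'conversation_id',
    'author', and 'is_initiating' (True for the post that starts the thread).
    """
    by_conv = defaultdict(list)
    for post in posts:
        by_conv[post["conversation_id"]].append(post)

    selected = {}
    for conv_id, conv_posts in by_conv.items():
        # The target user is the author of the initiating post.
        starters = [p["author"] for p in conv_posts if p["is_initiating"]]
        if not starters:
            continue  # no initiating post was collected for this thread
        target = starters[0]
        n_target = sum(1 for p in conv_posts if p["author"] == target)
        if n_target >= min_target_posts:
            selected[conv_id] = {"target_user": target, "posts": conv_posts}
    return selected


toy_posts = [
    {"conversation_id": "c1", "author": "alice", "is_initiating": True},
    {"conversation_id": "c1", "author": "bob", "is_initiating": False},
    {"conversation_id": "c1", "author": "alice", "is_initiating": False},
    {"conversation_id": "c2", "author": "carol", "is_initiating": True},
    {"conversation_id": "c2", "author": "dave", "is_initiating": False},
]
kept = select_conversations(toy_posts)
print(sorted(kept))
```
</preformat>
        <p>In this toy input, the second conversation is dropped because its target user posted only once, so there is no pair of consecutive target-user posts between which a mood change could be observed.</p>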
        <p>Table 1: Dataset distribution. Row labels recovered from the extracted table: Addiction, ADHD, Alcoholism, Anxiety, Autism, Bipolar, BPD, Depression, Eating Disorder, Health Anxiety, Loneliness, PTSD, Schizophrenia, Social Anxiety, Suicide, Total, Unique.</p>
        <sec id="sec-2-1-17">
          <title>Convs #Posts (#Users)</title>
          <p>Table 1 column headers recovered from the extracted table: Target Users, Avg posts/Convs, IS. Of the collected posts, 9,221 come from the 963 target users.</p>
          <p>3.3. Data Annotation</p>
          <p>Three annotators with educational backgrounds in Psychology and Computer Science were recruited to annotate the MoC in the new dataset. They were given a detailed briefing on the task, which involved determining the mood or emotion expressed in each sentence of the target user’s posts. The annotators identified a dominant mood for each user’s posts (anger, sad, joy, optimism, and neutral), which was the basis for determining MoC between consecutive posts. The task is defined as a three-class classification problem: Switch (IS), Escalation (IE), and No MoC (O), following the annotation scheme of [<xref ref-type="bibr" rid="ref5">22, 5</xref>]. IS represents abrupt changes in an individual’s emotional state, while IE signifies the evolving nature of mood changes. O indicates relative stability, i.e., no noticeable shifts in the user’s mood.</p>
          <p>The Valence and Arousal (VA) chart (shown in Figure 2) is used to annotate IS and IE, representing affective states in a continuous numerical VA space. According to the Circumplex model [<xref ref-type="bibr" rid="ref11">31</xref>], transitions in the VA space, such as moving from Anger to Sad or Anger to Joy and vice versa, either horizontally or vertically, correspond to Emotional Escalation (IE). Conversely, diagonal transitions, like going from Sad to Joy or Anger to Optimism and vice versa, indicate Emotional Switch (IS). In simpler terms, for escalation, either the level of valence or arousal remains the same even if the emotion changes. In contrast, for a switch, both the valence and arousal levels change. When the emotion remains unchanged or neutral throughout a conversation, it is labeled O. The use of VA space allows a more structured assessment of IS and IE and is less subjective than relying on simple annotator label judgments of mood change as in [<xref ref-type="bibr" rid="ref5">22, 5</xref>].</p>
          <p>The annotators achieved a near-perfect agreement, with a mean Cohen’s Kappa score<sup>6</sup> of 0.808 across all 15 subreddits. Conflicts in annotations were resolved through a majority voting criterion, with the final manual label determined by one annotator, who acted as the chairperson, having a deeper understanding of the context and similarities to other shared tasks. From Table 1, it can be seen that the distribution of annotations for IE, IS, and O is highly imbalanced, reflecting the real scenario where emotional switches (IS) are infrequent, and escalations (IE) occur less frequently than relative stability (O). This distribution aligns with the finding that user posts commonly show stable moods.</p>
          <p><sup>6</sup> https://en.wikipedia.org/wiki/Cohen's_kappa</p>
          <p>4. Methodology</p>
          <p>This study delves into the performance evaluation of the state-of-the-art sequential and graph-based models on the novel ConversationMoC dataset. Additionally, it explores the potential of leveraging social and meta-interaction information through a multiplex network structure, where each layer captures distinct aspects of the conversation, including emotion, sentiment, and reply relations. Figure 3 shows an overview of this experimental framework, demonstrating how conversation dynamics are encoded. This can be achieved using a standalone sequential model, a graph-based model, or a combination of both. The following subsections provide an in-depth exploration of the evaluation framework.</p>
          <p>Figure 3: Overview of the experimental framework. Recoverable block labels, from input to output: Conversation posts (P); Post embedding; Sequence-to-sequence learning model; Temporal encoded post embedding; Multiplex graph learning model with the post-to-post multiplex graph (A); Conversation encoded post embedding; Masked non-target user post embedding and Softmax(Target user posts) for Moment of Change (MoC) classification; Multi-head attention layer, Flattening, and Softmax(FlattenedEMB) for Mental health discourse classification.</p>
          <p>4.1. Post embedding</p>
          <p>This study considers the concatenation of the pre-trained embeddings using averaged fastText word embedding [<xref ref-type="bibr" rid="ref12">32</xref>], Sentence-BERT (SBERT) [<xref ref-type="bibr" rid="ref13">33</xref>], and task-specific pre-trained RoBERTa-base models [<xref ref-type="bibr" rid="ref14">34</xref>]<sup>7</sup> for semantic representation of individual posts. These pre-trained embedding models have been utilized in various studies [<xref ref-type="bibr" rid="ref5 ref6">5, 6, 24</xref>] and demonstrated superior performance in the CLPsych2022 shared task [22]. Several preprocessing steps were performed before applying the post-embedding, such as normalizing keywords, anonymizing users<sup>8</sup>, converting to lowercase, and removing URL links.</p>
          <p>4.2. Sequential Representation</p>
          <p>To model the sequential progression of posts within a conversation and capture temporal dependencies in the target user’s moods, we utilize a Bidirectional Long Short-Term Memory (BiLSTM) model [<xref ref-type="bibr" rid="ref15 ref16">35, 36</xref>] as the fundamental component of the sequential representation model. The BiLSTM layer processes the input sequence of posts encoded using off-the-shelf pre-trained models (discussed in Section 4.1), denoted as <inline-formula><tex-math>P = \{p_1, p_2, \ldots, p_n\}</tex-math></inline-formula>, where each <inline-formula><tex-math>p_i</tex-math></inline-formula> represents an individual post. Mathematically, the BiLSTM network is defined as follows:
            <disp-formula>
              <label>(1)</label>
              <tex-math>\begin{gathered} \overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(x_i, \overrightarrow{h}_{i-1}, \overrightarrow{c}_{i-1}) \\ \overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(x_i, \overleftarrow{h}_{i+1}, \overleftarrow{c}_{i+1}) \\ h_i = [\overrightarrow{h}_i, \overleftarrow{h}_i] \end{gathered}</tex-math>
            </disp-formula>
            where <inline-formula><tex-math>x_i</tex-math></inline-formula> is the semantic embedding of the post <inline-formula><tex-math>p_i</tex-math></inline-formula>, <inline-formula><tex-math>\overrightarrow{h}_i</tex-math></inline-formula> and <inline-formula><tex-math>\overleftarrow{h}_i</tex-math></inline-formula> represent the hidden states of the forward and backward LSTMs, <inline-formula><tex-math>\overrightarrow{c}_{i-1}</tex-math></inline-formula> and <inline-formula><tex-math>\overleftarrow{c}_{i+1}</tex-math></inline-formula> are the previous cell states of the forward and backward LSTMs, and <inline-formula><tex-math>h_i</tex-math></inline-formula> represents the temporal enhanced post-embedding, which is a concatenation of the hidden states from both the forward and backward LSTMs. The BiLSTM layer processes the input sequence <inline-formula><tex-math>P</tex-math></inline-formula> sequentially, updating the hidden states <inline-formula><tex-math>h</tex-math></inline-formula> and cell states <inline-formula><tex-math>c</tex-math></inline-formula> at each time step <inline-formula><tex-math>i</tex-math></inline-formula>. This allows the model to capture the sequential information in the conversation, capturing the temporal dependencies between posts and enabling a better understanding of the user’s mood dynamics over time.</p>
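          <p>The bidirectional encoding of Equation (1) can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions, not the implementation used in this study: the gate layout, the random initialization, the toy dimensions, and the helper names (init_lstm, lstm_step, run_direction, bilstm) are hypothetical and chosen only for the example. A practical model would use a deep learning library with trained weights.</p>
          <preformat>
```python
import numpy as np


def init_lstm(input_dim, hidden_dim, rng):
    """Randomly initialize one LSTM direction (illustrative weights only)."""
    scale = 0.1
    return {
        "W": rng.normal(0.0, scale, (4 * hidden_dim, input_dim)),
        "U": rng.normal(0.0, scale, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM step: returns the new hidden state and cell state."""
    gates = params["W"] @ x + params["U"] @ h_prev + params["b"]
    i, f, o, g = np.split(gates, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


def run_direction(params, X, hidden_dim, reverse=False):
    """Run one direction over the post sequence X (n_posts x input_dim)."""
    n = X.shape[0]
    order = range(n - 1, -1, -1) if reverse else range(n)
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    out = np.zeros((n, hidden_dim))
    for t in order:
        h, c = lstm_step(params, X[t], h, c)
        out[t] = h
    return out


def bilstm(X, fwd, bwd, hidden_dim):
    """h_i = [forward h_i, backward h_i] for every post i."""
    h_fwd = run_direction(fwd, X, hidden_dim)
    h_bwd = run_direction(bwd, X, hidden_dim, reverse=True)
    return np.concatenate([h_fwd, h_bwd], axis=1)


rng = np.random.default_rng(0)
n_posts, input_dim, hidden_dim = 5, 8, 4
X = rng.normal(size=(n_posts, input_dim))  # stand-in post embeddings
fwd = init_lstm(input_dim, hidden_dim, rng)
bwd = init_lstm(input_dim, hidden_dim, rng)
H0 = bilstm(X, fwd, bwd, hidden_dim)
print(H0.shape)
```
</preformat>
          <p>Each row of H0 concatenates the forward state, which has seen the posts up to position i, with the backward state, which has seen the posts from position i onward, so every post embedding carries context from both directions of the conversation.</p>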
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title><sup>7</sup> https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment <sup>8</sup> Converting original user name to @username</title>
        <p>Figure 4: Multiplex network structure representation of a conversation using a two-layer GCN. Recoverable labels: Conversation input posts in temporal order (P); Conversation input posts multiplex network (A) with Reply, Emotion, and Sentiment layers; (Temporal Encoded) Post embedding H(0); a 2-layer GCN per multiplex layer with shared weights; Max pooling; Conversation encoded post embedding H.</p>
        <p>4.3. Multiplex Graph Representation</p>
        <p>Social media conversations are inherently non-linear, marked by users responding to earlier and recent posts, potentially influencing the mood or emotion of future posts. Figure 4 shows a conversation’s multiplex network structure representation using a two-layer Graph Convolutional Network (GCN). This approach captures this non-linearity by introducing a multiplex network consisting of reply, sentiment, and emotion network layers. Specifically, the reply layer focuses on linking posts involved in social interactions between users. The emotion and sentiment layers are constructed by linking posts with similar emotions and sentiments, classified using the pre-trained RoBERTa-based emotion and sentiment models [<xref ref-type="bibr" rid="ref14">34</xref>]. The GCN model effectively encodes the dependencies between each layer and the social and meta-interaction across various aspects of the conversation.</p>
        <p>Let <inline-formula><tex-math>A_r</tex-math></inline-formula>, <inline-formula><tex-math>A_e</tex-math></inline-formula>, and <inline-formula><tex-math>A_s</tex-math></inline-formula> represent the adjacency matrices of the reply, emotion, and sentiment layers, including self-loops. Mathematically, the <inline-formula><tex-math>l</tex-math></inline-formula>-layer GCN propagation over the <inline-formula><tex-math>M</tex-math></inline-formula>-layer multiplex network is defined in Equation (2), where each row of the matrix <inline-formula><tex-math>H^{(l-1)}</tex-math></inline-formula> is the input post-embedding at GCN layer <inline-formula><tex-math>l</tex-math></inline-formula>, <inline-formula><tex-math>\sigma</tex-math></inline-formula> denotes the Rectified Linear Unit activation function, while <inline-formula><tex-math>D_m</tex-math></inline-formula> represents the degree of nodes in the <inline-formula><tex-math>m</tex-math></inline-formula>-th multiplex layer. <inline-formula><tex-math>W^{(l)}</tex-math></inline-formula> is the weight matrix at layer <inline-formula><tex-math>l</tex-math></inline-formula>, which is learned during the training process. The weights <inline-formula><tex-math>W^{(l)}</tex-math></inline-formula> are shared across all layers. By updating the shared weight matrix <inline-formula><tex-math>W^{(l)}</tex-math></inline-formula> during the training process, the GCN model assigns different importance to different layers of the multiplex network.</p>
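        <p>The multiplex propagation rule of this subsection can be sketched in NumPy as follows. This is a minimal illustration under assumptions, not the code used in this study: the toy adjacency matrices, the random weights, the choice of symmetric (undirected) graphs, and the function names (normalize_adjacency, multiplex_gcn_layer) are hypothetical. Each multiplex layer is given self-loops and symmetrically normalized by its node degrees, passed through a ReLU-activated graph convolution with the shared weight matrix, and the per-layer outputs are combined by an element-wise maximum.</p>
        <preformat>
```python
import numpy as np


def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2} for one multiplex layer."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    deg = A_hat.sum(axis=1)                   # node degrees (at least 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multiplex_gcn_layer(H, adjacencies, W):
    """One GCN layer with weights W shared across all multiplex layers,
    followed by an element-wise max over the layers."""
    per_layer = [np.maximum(normalize_adjacency(A) @ H @ W, 0.0)
                 for A in adjacencies]
    return np.max(np.stack(per_layer, axis=0), axis=0)


rng = np.random.default_rng(0)
n_posts, dim_in, dim_hidden, dim_out = 4, 6, 5, 3

# Toy symmetric adjacency matrices (self-loops are added during normalization).
A_reply = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                    [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
A_emotion = np.array([[0, 0, 1, 0], [0, 0, 0, 1],
                      [1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
A_sentiment = np.array([[0, 1, 1, 0], [1, 0, 0, 0],
                        [1, 0, 0, 0], [0, 0, 0, 0]], dtype=float)
layers = [A_reply, A_emotion, A_sentiment]

H0 = rng.normal(size=(n_posts, dim_in))      # stand-in for the BiLSTM output
W1 = rng.normal(scale=0.1, size=(dim_in, dim_hidden))
W2 = rng.normal(scale=0.1, size=(dim_hidden, dim_out))

H1 = multiplex_gcn_layer(H0, layers, W1)
H2 = multiplex_gcn_layer(H1, layers, W2)     # conversation-encoded posts
print(H2.shape)
```
</preformat>
        <p>Because the same weight matrix is applied to every multiplex layer, the layers differ only through their adjacency structure, and the maximum keeps, for each feature of each post, the strongest response among the reply, emotion, and sentiment neighborhoods.</p>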
        <p>
          <disp-formula>
            <label>(2)</label>
            <tex-math>H^{(l)} = \max\left( \left\{ \sigma\left( D_m^{-1/2} A_m D_m^{-1/2} H^{(l-1)} W^{(l)} \right) \right\}_{m=1}^{M} \right)</tex-math>
          </disp-formula>
          Further, by applying max pooling, the GCN allows the network to capture the most prominent information from each layer, potentially emphasizing important features contributing to the overall task. The resulting node feature matrix <inline-formula><tex-math>H^{(l)}</tex-math></inline-formula> represents the enhanced post-embedding of the <inline-formula><tex-math>l</tex-math></inline-formula>-layer GCN model. In this study, we consider a 2-layer GCN model, where the input <inline-formula><tex-math>H^{(0)}</tex-math></inline-formula> represents the temporal enhanced post-embedding output from the BiLSTM network and the output <inline-formula><tex-math>H^{(2)}</tex-math></inline-formula> represents the final enhanced post-embedding (H), capturing both temporal and multiplex network of social and meta-interaction of the conversation.
        </p>
        <p>4.4. Multitask classification</p>
        <p>The evaluation framework tackles two tasks simultaneously: Moment of Change (MoC) detection and Mental Health (MH) classification. MoC detection focuses on identifying mood shifts of the target user at the post level, while MH classification operates at the conversation level to determine the specific MH topics in discourse. To improve the MH classification task, we add a multi-head self-attention layer [<xref ref-type="bibr" rid="ref17">37</xref>] over the enhanced post-embedding (H), resulting in an attention-weighted encoded representation (<inline-formula><tex-math>H_{att}</tex-math></inline-formula>). Mathematically, the classification tasks can be defined as:
          <disp-formula>
            <label>(3)</label>
            <tex-math>\begin{gathered} \hat{y}^{MoC} = \mathrm{softmax}(b * H) \\ \hat{y}^{MH} = \mathrm{softmax}(\mathrm{Flatten}(H_{att})) \end{gathered}</tex-math>
          </disp-formula>
          where b is a Boolean vector to mask the non-target users’ posts from H.
        </p>
        <p>4.5. Loss functions</p>
        <p>Because the evaluation framework considers entire conversations to classify the Moments of Change (MoC) of the target user’s mood, it is essential to mask the posts of non-target users. To train the model for the MoC detection task, we apply the Focal Loss function [<xref ref-type="bibr" rid="ref18">38</xref>], originally designed for object detection tasks, to address the imbalanced class distribution. We use the traditional categorical cross-entropy (CE) loss function for the MH classification task. The loss functions for each task can be mathematically defined as:
          <disp-formula>
            <label>(4)</label>
            <tex-math>\begin{gathered} \mathcal{L}_{MoC} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \left( \alpha_c \cdot (1 - \hat{y}_{i,c})^{\gamma} \cdot T^{(1)}_{i,c} \cdot \log(\hat{y}_{i,c}) \right) \\ \mathcal{L}_{MH} = -\sum_{k=1}^{K} T^{(2)}_{k} \cdot \log(\hat{y}^{MH}_{k}) \end{gathered}</tex-math>
          </disp-formula>
        </p>
        <p>The heuristic method of Section 4.6.1 serves as the baseline model for evaluating the performance of the evaluation framework.</p>
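        <p>The two losses of Section 4.5 can be illustrated with a short pure-Python sketch. It assumes the standard focal loss form, in which the log-probability of the true class is scaled by a class weight and by (1 - p) raised to the focusing parameter, averaged over the target user’s posts. The class weights, the focusing parameter value, and the toy probabilities below are hypothetical and are not the settings used in this study.</p>
        <preformat>
```python
import math


def focal_loss(probs, labels, alpha, gamma=2.0):
    """Mean focal loss over posts.

    probs: list of per-post class probability lists (each sums to 1).
    labels: list of true class indices, one per post.
    alpha: per-class weight factors.
    gamma: focusing parameter; gamma = 0 gives weighted cross-entropy.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p_true = p[y]
        total += -alpha[y] * (1.0 - p_true) ** gamma * math.log(p_true)
    return total / len(labels)


def cross_entropy(probs, label):
    """Categorical cross-entropy for one conversation-level MH prediction."""
    return -math.log(probs[label])


# Toy example with three MoC classes ordered as (IE, IS, O).
moc_probs = [[0.1, 0.1, 0.8], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
moc_labels = [2, 0, 1]
alpha = [1.0, 1.0, 1.0]

plain = focal_loss(moc_probs, moc_labels, alpha, gamma=0.0)
focused = focal_loss(moc_probs, moc_labels, alpha, gamma=2.0)
print(round(plain, 4), round(focused, 4))
```
</preformat>
        <p>With a positive focusing parameter, confidently classified posts (typically the frequent O class) contribute little to the loss, so training is driven more by the rarer IS and IE posts, which is the motivation for using this loss on an imbalanced label distribution.</p>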
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Results and discussion</title>
      <p>In the loss functions of Equation (4), <inline-formula><tex-math>N</tex-math></inline-formula> and <inline-formula><tex-math>C</tex-math></inline-formula> represent the number of posts and MoC class labels in the conversation, <inline-formula><tex-math>\alpha_c</tex-math></inline-formula> represents the weight factor for the MoC class <inline-formula><tex-math>c</tex-math></inline-formula>, and <inline-formula><tex-math>\gamma</tex-math></inline-formula> represents the focusing parameter to control the rate at which the loss decreases for well-classified examples. For the MoC classification task, <inline-formula><tex-math>T^{(1)}</tex-math></inline-formula> represents the true MoC label one-hot vector of the target user post i in the conversation, while <inline-formula><tex-math>T^{(2)}</tex-math></inline-formula> represents the true label of the conversation MH topic; the MH term of Equation (4) is the negative sum of <inline-formula><tex-math>T^{(2)} \cdot \log(\hat{y})</tex-math></inline-formula> over the MH topic classes.</p>
      <p>4.6. Comparison of model variants</p>
      <p>The MoC and MH classification tasks can be evaluated as single or multitask setups. Moreover, the conversation dynamics can be encoded in both setups using a standalone BiLSTM model, GCN model, or a combination of both (BiLSTM+GCN). To assess the impact of conversation context, we compare two input scenarios: (i) TU, which encompasses solely the target user’s sequence of posts, and (ii) All, which encompasses the sequence of posts interacting with the target user’s posts in the conversation. Based on the input type, we evaluate the considered models (BiLSTM, GCN, BiLSTM+GCN) over the MoC dataset using the pretrained post-embedding (discussed in Section 4.1). Hyperparameter details are in Appendix Section A.2.</p>
      <p>4.6.1. Heuristic model for MoC detection</p>
      <p>Inspired by the Circumplex model [<xref ref-type="bibr" rid="ref11">31</xref>], we design a heuristic method for detecting Moments of Change (MoC) in the target user’s posts. We employ a pre-trained RoBERTa emotion classifier [<xref ref-type="bibr" rid="ref14">34</xref>] to classify the target user’s posts. This model predicts four primary emotion classes: anger, sad, joy, and optimism. It assigns each class a confidence score (t). If a post doesn’t meet the minimum confidence threshold (t &gt;= 0.7) for any of the four emotions considered, it is labeled as neutral. Further, using the Valence-Arousal (VA) space, we heuristically assign the Moments of Change (MoC) in the target user’s posts.</p>
      <p>5.1. Detection of Moment of change</p>
      <p>This section evaluates the performance of the considered baseline models on the ConversationMoC dataset. Initially, we evaluate these models using two input scenarios: (i) using only the target user’s posts (TU) and (ii) utilizing the entire conversation (All). Notably, as the TU input lacks social interactions, models like GCN and BiLSTM+GCN are not evaluated in this context. Further, we conduct an extensive analysis to understand the impact of different layers within the multiplex network on the downstream tasks. The experimental results for the MoC detection task, achieved through 10-fold cross-validation, are presented in Table 2. This table includes the mean F1-scores for each class (IE, IS, O) as well as the macro F1-score, providing a comprehensive view of the overall performance. From the table, it is observed that the BiLSTM+GCN model consistently outperforms its standalone counterparts. In particular, the BiLSTM+GCN models, when incorporating the multiplex graph input with the Emotion, Sentiment, and Reply (ESR) layers, exhibit the highest macro F1-scores, achieving 0.422 in the single-task setup and 0.438 in the multitask setup. An intriguing observation is that the performance of specific models significantly deviates from the average in a few folds, leading to a standard deviation of approximately ± 0.02. For a detailed view of these results, please refer to the boxplot presented in Appendix Figure 6, which visualizes the F1-score performances of the multitask models across all folds. These findings underscore the effectiveness and consistency of the proposed framework, validating its superior performance in detecting MoC across mental health-related tasks.</p>
      <p>The results are evident: incorporating an entire conversation context notably improves the performance of MoC detection models compared to using only the target user posts. Furthermore, the multitask setup consistently outperforms the single-task setup<sup>9</sup>. Delving into the performance across different classes reveals intriguing insights. The GCN model, in the multitask setup, emerges as the best model, achieving an F1-score of 0.287 for the escalation (IE) class. On the other hand, for the switch (IS) class, the BiLSTM+GCN model achieves the best performance, with an F1-score of 0.169. The single-task BiLSTM model, which exclusively relies on the posts of the target user, achieves the highest F1-score of 0.906 for the No MoC (O) class. It suggests that the posts from the target user alone contain more informative signals for the O class than the context provided by the conversation. The heuristic MoC classification model also achieves an F1-score of 0.164 in classifying the IS class, higher than any single-task models. This underscores the effectiveness of the pre-trained RoBERTa-based emotion classifier.</p>
      <p><sup>9</sup> A boxplot comparison of the models’ F1-score performances using categorical cross-entropy loss function and Focal loss function is shown in Appendix Figure 6.</p>
      <p>[...] the conversation dynamics, ultimately culminating in enhanced performance. In summary, the results in Table 2 highlight the importance of using multiplex networks and emphasize the pivotal role played by the Reply network in MoC detection. Combining social interactions, emotional expressions, and sentiment patterns provides a complete conversation view, allowing the model to handle the tasks effectively.</p>
      <p>5.1.1. Graph multiplex layers analysis</p>
      <p>To delve deeper into the impact of diferent layers within
the multiplex network, we conducted a comprehensive 5.2. Mental health classification
performance analysis of the BiLSTM+GCN model, as Figure 5 presents a bar chart illustrating the performance
detailed in Table 2. The results reveal that the model of various models in classifying mental health (MH)
disperforms better when leveraging the multiplex networks course. Rather than relying on traditional topic
modthan relying on individual networks. Significantly, when eling techniques, we directly categorize the MH topics
we examine the performance of the BiLSTM+GCN model discussed within the conversations using the models
conacross the respective graphs, the Reply graph consistently sidered in this study. The evaluation includes single-task
outperforms the Emotion and Sentiment graphs. This and multitask setups, using the categorical cross-entropy
suggests that social interactions provide more useful in- loss function to train the MH classification task. As seen
formation for the tasks we are interested in. In particular, in Figure 5, the performance is notably superior for the
the Reply graph contains authentic, ground-truth data multitask models compared to their single-task
counterof social interactions. In contrast, the Emotion and Senti- parts. In this study, the most notable performers among
ment graphs are constructed based on the emotion and multitask models are the BiLSTM (All) and BiLSTM+GCN
sentiment classification of each post using the pretrained (R), both achieving remarkable macro F1-scores of 0.85
RoBERTa classifier, which is susceptible to potential mis- and 0.84, respectively. These results substantiate that
classifications, as evidenced by the performance of the incorporating conversation contextual information
sigheuristic MoC classification model in handling IE and O nificantly enhances the accuracy of MH classification,
classes. Moreover, when incorporating Reply and Senti- particularly when considering only the target user’s posts
ment networks, the model’s performance improved even as the input data. This observation highlights the
substanfurther, achieving the highest 0.470 F1-score. This in- tial contribution of conversation context information for
dicates that the Reply network is practical in capturing enhancing the classification of mental health discourse.
changes in the target user’s mood. The interplay be- Delving deeper into the performance across
individtween users and the presence of emotionally charged ual MH topics, it becomes apparent that the BiLSTM
(sentimental) conversations significantly impacts MoC model, incorporating All posts, excels in 8 MH classes,
detection. By incorporating these additional layers, the while the BiLSTM+GCN (Reply) model leads in 7 MH
model attains a more comprehensive understanding of classes (results detailed in Appendix Table 5). These
re(a) Single-task
(b) Multitask
sults underscore the importance of conversation context
information, with both models demonstrating robust
performance across various MH topics. In summary, these
ifndings emphasize the advantages of multitask
models and highlight that integrating conversation context
along with the Reply network significantly enhances the
accuracy of MH classification within conversations. The
BiLSTM+GCN (All) model emerges as a standout
performer, achieving high performance in eight MH
categories within this study.</p>
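      <p>The confidence-threshold step of the heuristic model (Section 4.6.1) can be illustrated with a minimal sketch. It assumes the per-post emotion scores are already available as a dictionary; the function name label_post and the example scores are hypothetical, and the Valence-Arousal rules that turn emotion sequences into MoC labels are not reproduced here.</p>
      <preformat>
```python
from typing import Dict

# Emotion classes and threshold as described for the heuristic model.
EMOTIONS = ("anger", "sadness", "joy", "optimism")
THRESHOLD = 0.7


def label_post(scores: Dict[str, float], threshold: float = THRESHOLD) -> str:
    """Return the most confident emotion, or 'neutral' if none reaches the threshold."""
    best = max(EMOTIONS, key=lambda emotion: scores.get(emotion, 0.0))
    return best if scores.get(best, 0.0) >= threshold else "neutral"


# Hypothetical classifier outputs for a three-post timeline (illustrative only).
timeline = [
    {"anger": 0.05, "sadness": 0.85, "joy": 0.05, "optimism": 0.05},
    {"anger": 0.30, "sadness": 0.40, "joy": 0.20, "optimism": 0.10},
    {"anger": 0.02, "sadness": 0.03, "joy": 0.90, "optimism": 0.05},
]
labels = [label_post(scores) for scores in timeline]
print(labels)
```
      </preformat>
      <p>In this sketch the second post falls back to neutral because no emotion reaches 0.7; a change in label between consecutive posts is the kind of signal a downstream MoC heuristic could inspect.</p>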
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>This study introduces a novel publicly accessible dataset
(ConversationMoC) tailored to identify the Moments of
Change (MoC) and classify Mental Health (MH) discourse
within conversational settings. We investigate the importance
of incorporating conversation information for identifying MoC
and classifying MH discourse using a combination
of BiLSTM and GCN models in single-task and multitask
setups. The experimental results show that
incorporating conversation information benefits both tasks.
Further, encoding the intricate social interactions, emotional
dynamics, and sentiment patterns through a multiplex network
structure enhances classification performance. More
specifically, the Reply network emphasizes the significance of
social interactions and user engagement. Additionally,
when combined with the Sentiment and Emotion
networks, the classification performance further improves,
underscoring the influence of emotional conversations
and overall sentiment. The multiplex networks
represent an exciting new direction for future conversational
analysis and mental health detection research.</p>
    </sec>
    <sec id="sec-5">
      <title>7. Ethical Statement</title>
      <p>Ethical approval for this study was obtained from the
University of Southampton ethics board (submission
reference ERGO/FEPS/64959.A1). The research involves
the analysis of personal data sourced from the social
media platform Reddit. To ensure compliance with
ethical guidelines and regulations, we have adhered to the
Reddit platform API’s terms and conditions, and our
annotated dataset is shared with Reddit IDs only so other
researchers can download the original Reddit posts and
metadata directly from Reddit. During the annotation
process, the annotators were informed about the
potential risks of encountering disturbing content. They were
encouraged to take regular breaks and time-outs from
their annotation work to mitigate emotional overload.</p>
      <p>Additionally, a clinically trained psychologist has been
actively advising the team to provide expertise and
guidance throughout the project. A comprehensive risk
assessment has been conducted to identify and address any
potential risks associated with this task. Our
attention to ethical considerations and the well-being of the
annotators underscores our commitment to conducting
responsible and sensitive research in the field of mental
health analysis.</p>
      <p>8. Limitations
In this study, there are a few limitations that warrant
consideration. Firstly, our findings are derived from a single
Reddit dataset. While we envision the potential for our
models to generalize well to analogous conversational
datasets with a similar social context graph, we have
yet to conduct experiments on datasets beyond
Reddit. This limitation arises due to the unavailability of
publicly annotated datasets for MoC in this specific
domain, underscoring the significance of our contribution
in providing a new publicly accessible MoC dataset,
ConversationMoC, for prospective research. Additionally, our
study does not explore the performance of more recent
and larger language models (LLMs) like OpenAI’s
GPT-3/4, Meta’s LLaMA, Stanford’s Alpaca, and Berkeley’s
Gorilla models. While we anticipate potential improvements
in performance by leveraging these advanced models,
experimental validation of this hypothesis remains
pending. Furthermore, from the perspective of the evaluation
framework, we highlight two limitations together with
potential ways to mitigate them:</p>
      <p>• Contextual Understanding in Short Conversations:
Short conversations with limited posts may pose
challenges in contextual understanding. Integrating
LLMs could alleviate this issue by capturing a
broader context.</p>
      <p>• Semantic Consistency in Dynamic Conversations:
Longer conversations (e.g., 5 posts +
50 replies) with rapid emotional shifts present
hurdles in maintaining semantic consistency. In this
scenario, incorporating an additional attention layer
into the framework could serve to dynamically weight
the influence of different posts and replies within a
conversation. Moreover, exploring the integration of
guiding loss functions is suggested. These
functions would guide the model to focus on the
primary conversation topics and emotions, even
amidst swift emotional changes. This combined
approach could enhance the model’s
understanding of key conversation topics, particularly if the
conversation is full of changing emotions and
dynamics.</p>
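      <p>The attention-layer suggestion above can be made concrete with a small sketch. It uses scaled dot-product attention as a stand-in, which is an assumption on our part and not a component of the evaluated framework; the names attention_pool, posts, and query are hypothetical, and in a trained model the query vector would be learned.</p>
      <preformat>
```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    embeddings: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Weight each post embedding by its scaled dot product with the query."""
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, emb)) / math.sqrt(dim)
        for emb in embeddings
    ]
    weights = softmax(scores)
    # Weighted sum of the embeddings gives one conversation-level vector.
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return weights, pooled


# Hypothetical 2-dimensional embeddings for three posts or replies.
posts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # stands in for a learned query vector (assumption)
weights, pooled = attention_pool(posts, query)
print([round(w, 3) for w in weights])
```
      </preformat>
      <p>Posts whose embeddings align with the query receive larger weights, so a long thread contributes to the pooled representation mainly through the posts most relevant to the tracked topic or emotion.</p>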
    </sec>
    <sec id="sec-6">
      <title>9. Future work</title>
      <p>The conversation multiplex network encoding framework
has the potential to apply to various domains, and testing it
on diverse datasets is important. Our current investigation,
however, was limited by the scarcity of datasets with similar
characteristics. In the future, we aim to expand our analysis
to a broader range of conversation datasets, thereby
demonstrating the applicability of our framework beyond
this specific domain.</p>
      <p>10. Acknowledgement
This work was supported by the Natural Environment
Research Council (NE/S015604/1), the Economic and Social
Research Council (ES/V011278/1) and the Engineering
and Physical Sciences Research Council (EP/V00784X/1).</p>
      <p>The authors acknowledge the use of the IRIDIS High
Performance Computing Facility, and associated support
services at the University of Southampton, in the
completion of this work and the highly valuable insights into the
mental health domain from Aynsley Bernard of Kooth
Plc.
</p>
      <p>[9] L. G. Singh, S. R. Singh, Sentiment analysis of tweets using text and graph multi-views learning, Knowledge and Information Systems (2024). doi:10.1007/s10115-023-02053-8.
[10] L. G. Singh, A. Mitra, S. R. Singh, Sentiment analysis of tweets using heterogeneous multi-layer network representation and embedding, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 8932–8946.
[11] L. Qin, Z. Li, W. Che, M. Ni, T. Liu, Co-gat: A co-interactive graph attention network for joint dialog act recognition and sentiment classification, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2021, pp. 13709–13717.
[12] D. Sheng, D. Wang, Y. Shen, H. Zheng, H. Liu, Summarize before aggregate: A global-to-local heterogeneous graph inference network for conversational emotion recognition, in: Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 4153–4163.
[13] X. Huang, J. Zhang, D. Li, P. Li, Knowledge graph embedding based question answering, in: Proceedings of the twelfth ACM international conference on web search and data mining, 2019, pp. 105–113.
[14] Y. Zhang, H. Dai, Z. Kozareva, A. Smola, L. Song, Variational reasoning for question answering with knowledge graph, in: Proceedings of the AAAI conference on artificial intelligence, 2018.
[15] C. Gao, W. Lei, X. He, M. de Rijke, T.-S. Chua, Advances and challenges in conversational recommender systems: A survey, AI Open 2 (2021) 100–126.
[16] Z. Fu, Y. Xian, Y. Zhu, S. Xu, Z. Li, G. De Melo, Y. Zhang, Hoops: Human-in-the-loop graph reasoning for conversational recommendation, in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 2415–2421.
[17] D. Wang, Z. Li, H. Zheng, Y. Shen, Integrating user history into heterogeneous graph for dialogue act recognition, in: Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 4211–4221.
[18] H. Xu, Z. Yuan, K. Zhao, Y. Xu, J. Zou, K. Gao, Garnet: A graph attention reasoning network for conversation understanding, Knowledge-Based Systems 240 (2022) 108055.
[19] L. Yang, F. Wu, J. Gu, C. Wang, X. Cao, D. Jin, Y. Guo, Graph attention topic modeling network, in: Proceedings of The Web Conference 2020, 2020, pp. 144–154.
[20] M. De Choudhury, E. Kiciman, M. Dredze, G. Coppersmith, M. Kumar, Discovering shifts to suicidal ideation from mental health content in social media, in: Proceedings of the 2016 CHI conference on human factors in computing systems, 2016, pp. 2098–2110.
[21] Y. Pruksachatkun, S. R. Pendse, A. Sharma, Moments of change: Analyzing peer-based cognitive support in online mental health forums, in: Proceedings of the 2019 CHI conference on human factors in computing systems, 2019, pp. 1–13.
[22] A. Tsakalidis, J. Chim, I. M. Bilal, A. Zirikly, D. Atzil-Slonim, F. Nanni, P. Resnik, M. Gaur, K. Roy, B. Inkster, et al., Overview of the clpsych 2022 shared task: Capturing moments of change in longitudinal user posts, in: Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology, 2022, pp. 184–198.
[23] A. Hills, A. Tsakalidis, F. Nanni, I. Zachos, M. Liakata, Creation and evaluation of timelines for longitudinal user posts, in: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023, pp. 3773–3786.
[24] U. Bayram, L. Benhiba, Emotionally-informed models for detecting moments of change and suicide risk levels in longitudinal social media data, in: Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology, 2022, pp. 219–225.
[25] G. Coppersmith, M. Dredze, C. Harman, K. Hollingshead, From adhd to sad: Analyzing the language of mental health on twitter through self-reported diagnoses, in: Proceedings of the 2nd workshop on computational linguistics and clinical psychology: from linguistic signal to clinical reality, 2015, pp. 1–10.
[26] A. Benton, M. Mitchell, D. Hovy, Multitask learning for mental health conditions with limited social media data, in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, 2017, pp. 152–162. URL: https://aclanthology.org/E17-1015.
[27] S. Ji, T. Zhang, L. Ansari, J. Fu, P. Tiwari, E. Cambria, Mentalbert: Publicly available pretrained language models for mental healthcare, in: Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022, pp. 7184–7190.
[28] S. Ji, Towards intention understanding in suicidal risk assessment with natural language processing, in: Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 4028–4038.
[29] I. Lin, L. Njoo, A. Field, A. Sharma, K. Reinecke, T. Althoff, Y. Tsvetkov, Gendered mental health stigma in masked language models, in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2022, pp. 2152–2170. URL: https://aclanthology.org/2022.emnlp-main.139.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Appendix</title>
      <p>A.1. 15 subreddit topics
In this study, we collected data from 15 mental health
subreddits encompassing a wide range of topics. The 15
subreddits are Eating Disorder (r/EDAnonymous),
Addiction (r/addiction), Alcoholism (r/alcoholism), Attention
Deficit Hyperactivity Disorder (ADHD) (r/adhd), Anxiety
(r/anxiety), Autism (r/autism), Bipolar Disorder
(r/BipolarReddit), Borderline Personality Disorder (BPD) (r/bpd),
Depression (r/depression), Health Anxiety
(r/healthanxiety), Loneliness (r/lonely), Post-Traumatic Stress
Disorder (PTSD) (r/ptsd), Schizophrenia (r/schizophrenia),
Social Anxiety (r/socialanxiety), and Suicide
(r/SuicideWatch). By considering these diverse mental health topics,
we aimed to capture a comprehensive picture of mental
health discussions in online communities.</p>
      <p>A.2. Hyperparameters
This study considers several hyperparameters to optimize
the performance of the proposed model for detecting
moments of change and identifying mental health topics
in conversations. The detailed hyperparameter settings,
including the dimensions of the output representations
from pretrained models, are presented in Table 3.</p>
      <p>Table 3 (columns: Hyperparameters, Value; Pretrained model, Value).
Hyperparameters: Optimizer; Learning rate; Training epochs; Batch size; BiLSTM #Units; Multihead attention layers.
Pretrained models: FastText [32]; Sentence-BERT [
          <xref ref-type="bibr" rid="ref13">33</xref>
          ]; RoBERTa-base (emoji); RoBERTa-base (emotion) [
          <xref ref-type="bibr" rid="ref14">34</xref>
          ]; RoBERTa-base (hate); RoBERTa-base (irony); RoBERTa-base (offensive); RoBERTa-base (sentiment).</p>
      <p>A.3. Moment of change classification
Figure 6 presents boxplots representing the distribution
of F1-scores for the moment of change (MoC)
classification across three classes: IE (escalation), IS (switch), and
O (No MoC), including the macro F1-score. Each
boxplot represents a different model considered in this study,
with the x-axis representing the models and the y-axis
representing the F1-scores. The boxplots show the
median (middle line), interquartile range (box), and range of
the scores (whiskers), providing a visual representation
of the performance distribution for MoC classification.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Abbreviation</th><th>Model</th><th>Input type</th></tr>
          </thead>
          <tbody>
            <tr><td>H</td><td>Heuristic classifier</td><td>Target user’s posts only</td></tr>
            <tr><td>TU</td><td>BiLSTM (TU)</td><td>Target user’s posts only</td></tr>
            <tr><td>All</td><td>BiLSTM (All)</td><td>Entire posts in a conversation</td></tr>
            <tr><td>E</td><td>BiLSTM+GCN (E)</td><td>Entire posts + Emotion graph</td></tr>
            <tr><td>S</td><td>BiLSTM+GCN (S)</td><td>Entire posts + Sentiment graph</td></tr>
            <tr><td>R</td><td>BiLSTM+GCN (R)</td><td>Entire posts + Reply graph</td></tr>
            <tr><td>ES</td><td>BiLSTM+GCN (ES)</td><td>Entire posts + Emotion and Sentiment multiplex graph</td></tr>
            <tr><td>ER</td><td>BiLSTM+GCN (ER)</td><td>Entire posts + Emotion and Reply multiplex graph</td></tr>
            <tr><td>SR</td><td>BiLSTM+GCN (SR)</td><td>Entire posts + Sentiment and Reply multiplex graph</td></tr>
            <tr><td>ESR</td><td>BiLSTM+GCN (ESR)</td><td>Entire posts + Emotion, Sentiment, and Reply multiplex graph</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>A.4. Mental Health classification
Table 5 reports the results of the top-performing multitask models
for each of the 15 individual
mental health categories. The table showcases the
effectiveness of these models in accurately classifying mental
health categories, as indicated by their high F1-scores
achieved through 10-fold cross-validation.</p>
      <p>Table 5. Mental Health (MH) classification task performance (F1-score). Bold indicates top-performing models across individual MH
categories and macro-F1 scores. Mean results for 10-fold cross-validation are reported with standard deviations.</p>
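      <p>As an illustration of how the scores reported in this appendix are typically aggregated, the following minimal sketch computes a macro F1-score from per-class F1 and summarizes fold-level scores as mean with standard deviation. The function names and toy labels are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.</p>
      <preformat>
```python
from statistics import mean, stdev
from typing import List, Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class from true-positive, false-positive, false-negative counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return mean(f1_for_class(y_true, y_pred, label) for label in labels)


def summarize_folds(fold_scores: List[float]) -> str:
    """Format fold scores as 'mean (± std)'; sample std is an assumption."""
    return f"{mean(fold_scores):.3f} (± {stdev(fold_scores):.3f})"


# Toy MoC labels (illustrative only): O = No MoC, IE = escalation, IS = switch.
y_true = ["O", "O", "IE", "IS"]
y_pred = ["O", "IE", "IE", "O"]
score = macro_f1(y_true, y_pred, ["IE", "IS", "O"])
print(round(score, 3))
print(summarize_folds([0.84, 0.86]))
```
      </preformat>
      <p>Because the macro average weights every class equally, a weak score on a rare class such as IS lowers the overall figure even when the frequent O class is classified well.</p>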
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Schoene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          ,
          <article-title>Natural language processing applied to mental illness detection: a narrative review</article-title>
          ,
          <source>NPJ digital medicine 5</source>
          (
          <year>2022</year>
          )
          <fpage>46</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Naskar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nandi</surname>
          </string-name>
          , E. O. d. l. Rivaherrera,
          <article-title>Emotion dynamics of public opinions on twitter</article-title>
          ,
          <source>ACM Transactions on Information Systems (TOIS) 38</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z. P.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Levitan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zomick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hirschberg</surname>
          </string-name>
          ,
          <article-title>Detection of mental health from reddit via deep contextualized representations</article-title>
          ,
          <source>in: Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>147</fpage>
          -
          <lpage>156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Desmet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Soldaini</surname>
          </string-name>
          , S. MacAvaney, N. Goharian,
          <article-title>Smhd: a large-scale resource for exploring online language usage for multiple mental health conditions</article-title>
          ,
          <source>in: Proceedings of the 27th International Conference on Computational Linguistics</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1485</fpage>
          -
          <lpage>1497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Tsakalidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Nanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hills</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Liakata</surname>
          </string-name>
          ,
          <article-title>Identifying moments of change from longitudinal user text</article-title>
          ,
          <source>in: Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Azim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Middleton</surname>
          </string-name>
          ,
          <article-title>Detecting moments of change and suicidal risks in longitudinal user texts using multi-task learning</article-title>
          ,
          <source>in: Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ghosal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Poria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chhaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gelbukh</surname>
          </string-name>
          ,
          <article-title>Dialoguegcn: A graph convolutional neural network for emotion recognition in conversation</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>164</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Neerkaje</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Aletras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Flek</surname>
          </string-name>
          ,
          <article-title>Towards suicide ideation detection through online conversational context</article-title>
          ,
          <source>in: Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1716</fpage>
          -
          <lpage>1727</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of tweets using text and graph multi-views learn-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Low</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rumker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Talkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Torous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cecchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Natural language processing reveals vulnerable mental health support groups and heightened health anxiety on Reddit during COVID-19: Observational study</article-title>
          ,
          <source>Journal of medical Internet research</source>
          <volume>22</volume>
          (
          <year>2020</year>
          )
          <elocation-id>e22635</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>A circumplex model of affect</article-title>
          ,
          <source>Journal of personality and social psychology</source>
          <volume>39</volume>
          (
          <year>1980</year>
          )
          <fpage>1161</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
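          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Enriching word vectors with subword information</article-title>
          ,
          <source>Transactions of the association for computational linguistics</source>
          <volume>5</volume>
          (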
          <year>2017</year>
          )
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-bert: Sentence embeddings using siamese bert-networks</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3982</fpage>
          -
          <lpage>3992</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Anke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Neves</surname>
          </string-name>
          ,
          <article-title>TweetEval: Unified benchmark and comparative evaluation for tweet classification</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: EMNLP 2020</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1644</fpage>
          -
          <lpage>1650</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Bidirectional long short-term memory networks for relation classification</article-title>
          ,
          <source>in: Proceedings of the 29th Pacific Asia conference on language, information and computation</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>78</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kawakami</surname>
          </string-name>
          ,
          <article-title>Supervised sequence labelling with recurrent neural networks</article-title>
          ,
          <source>Ph.D. thesis</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
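          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>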
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>T.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Girshick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dollár</surname>
          </string-name>
          ,
          <article-title>Focal loss for dense object detection</article-title>
          ,
          <source>in: Proceedings of the IEEE international conference on computer vision</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>2980</fpage>
          -
          <lpage>2988</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Adam: A method for stochastic optimization</article-title>
          ,
          <source>in: 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings</source>
          ,
          <year>2015</year>
          . URL: http://arxiv.org/abs/1412.6980.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>