<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of Global Word Representations for Depression Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Niveditha Sekar</string-name>
          <email>nivedithasekarit@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S Chandrakala</string-name>
          <email>chandrakala@cse.sastra.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G Prakash</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of CSE, Amrita School of Engineering</institution>
          ,
          <addr-line>Bengaluru, Amrita Vishwa Vidyapeetham</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ISIC'21: International Semantic Intelligence Conference</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Intelligent Systems Lab, School of Computing, SASTRA Deemed to be University</institution>
          ,
          <addr-line>Thanjavur</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>136</fpage>
      <lpage>148</lpage>
      <abstract>
        <p>Social media platforms such as Twitter, Facebook, Google Plus, Reddit and Tumblr are widely used by people to communicate and to share views and feelings with others freely. The information obtained from these short text messages helps in predicting their emotions, views, sentiments and opinions, and it is applied in different fields such as marketing, elections, product reviews, sentiment analysis and emotion detection. Behavioral analysis from text data is another widely popular field. This paper gives an analysis of global word representations and an overview of the work done on depression detection related tasks. Major steps such as pre-processing of data, feature extraction, representation and classification methods are summarized.</p>
      </abstract>
      <kwd-group>
        <kwd>Social media</kwd>
        <kwd>depression detection</kwd>
        <kwd>behavioral analysis</kwd>
        <kwd>emotion detection</kwd>
        <kwd>GloVe representation</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Behavioral analysis is the study of human
behavior. It involves observing behavior,
identifying the mental state, and analyzing and
understanding changes in human behavior.
Behavioral analysis is also called
emotional or sentiment analysis. Among the
several emotions, the crucial ones are the
negative emotions, such as stress, depression,
frustration, hate, envy, anger, anxiety,
boredom and panic. These emotions may affect
the mental health as well as the physical
health of a person. Among them, depression is
a persistent mood disorder and, in the worst
case, it can be life-threatening. So it is
essential to identify the people at risk of
depression. Face-to-face interviews and a set
of questionnaires are used by psychiatrists to
understand the behavioral health of a person.
This provides a more accurate result, but some
people are not aware of abnormalities in their
mental health and do not consult a
psychiatrist. In order to address this,
depression can be detected from the social
media data of the users itself [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Since most people
around the world are using social media like
Facebook, Twitter, Instagram etc., depression
can be detected from their text messages,
status updates, the posts they share,
self-reported surveys and the communities or
pages they follow [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2-4</xref>
        ].
      </p>
      <p>
        This analysis can be done from text
data, speech/audio data and visual data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The data for this analysis can be collected
from any social media. Since most users prefer
to share short text messages about the events
happening around them or information about
themselves, it is more informative to analyze
the social media text data. This sentiment
analysis is very popular since it is needed in
wide application areas such as marketing,
artificial intelligence, political science,
human-computer interaction, psychology, stock
market prediction etc. Figure 1 shows the flow
diagram of the depression detection system.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Challenges in short text data analysis</title>
      <p>
        Text data collected from any social
media does not have a structure. Each user
expresses his/her views in a different way, and
the text includes new words, short forms of
words, spelling errors etc.
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It is difficult to detect depression from
a single tweet of a user. Thus, we need to
observe a history of tweets of a particular user
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Also, there is a word limit for twitter
tweets: within 140 characters it is hard to
express one’s feelings, and it is also hard for
the analyst to interpret them [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In
order to identify the emotion, the analyst has
to examine the comments and retweets on that
particular tweet. This is a long chain process
for detecting the emotion of a particular user:
a large collection of tweets from the history
of that particular user has to be taken into
account, and the comments and retweets for
each tweet by the user also have to be
considered for this emotion detection.
      </p>
      <p>
        A typical social media user shares
information about themselves in any of these
forms: text messages, photos or videos. They
share information in a consistent manner. The
opposite is also true, i.e. users who are
under stress or depression are not much
interested in communicating on social
media [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This low activeness in social
media results in fewer tweets, and thereby it
is difficult to identify the emotion of the
user with accuracy. The main task in emotion
analysis is to understand the semantic nature
of the short text messages. Most of the
features identified from a short text or tweet
are sparse features. It is really challenging
to detect the emotion from such sparse
features, since they contribute very little
value to the detection of emotion [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In a word-level
representation, most of the words identified
are ambiguous, and they also contain stop
words. Hence, it is difficult for a classifier
to identify the emotion class label [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. It is
also difficult to identify the original meaning
of a sentence when it has a sarcastic tone,
since the sentence may sound joyful but
actually express sadness. This leads to false
positives in the result [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Short text datasets for depression detection</title>
      <p>
        The short text dataset can be collected through
the Twitter public API [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] or
through the short text datasets, which are
already available [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15-17</xref>
        ]. The Twitter public API
provides a means to access the twitter software
platform. Several software libraries are
available for each programming language,
namely tweepy for Python and rtweet for R.
The Twitter API is of two types: the Twitter
REST API and the Twitter Streaming API. The
Twitter Streaming API [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] will provide live
tweets until you stop it, whereas the REST API
will provide historical data. Table 1 lists a
few short text datasets which are used in the
depression detection literature.
      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Short text datasets used in depression detection literature.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Dataset name</th><th>Description</th></tr>
          </thead>
          <tbody>
            <tr><td>CLPsych dataset</td><td>1,746 twitter users examples, in which 246 are PTSD users and 327 are depressed users.</td></tr>
            <tr><td>BellLetsTalk campaign dataset</td><td>All tweets with the #BellLetsTalk hashtag are collected, in which 95 people disclosed that they are depressed.</td></tr>
            <tr><td>CLEF/eRisk 2017 dataset</td><td>887 Reddit users examples, in which 135 are depressed.</td></tr>
            <tr><td>Sina weibo dataset</td><td>23,304 users tweets are crawled, in which 11,074 users are stressed.</td></tr>
            <tr><td>LiveJournal dataset</td><td>This dataset consists of 2,132 posts, in which 758 are depressed posts.</td></tr>
            <tr><td>SemEval 2007 dataset</td><td>This dataset consists of 1,250 news headlines, labelled into 6 emotions.</td></tr>
            <tr><td>ISEAR dataset</td><td>This dataset contains 7,666 sentences, labelled into 7 emotions.</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>4. Pre-processing the short text</title>
      <p>Before feature selection, the short text data is
pre-processed to refine the unstructured and
noisy data. The pre-processing phase is
important, as it helps in improving the
performance.</p>
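      <p>A minimal sketch of such a pre-processing routine, using only standard-library regular expressions (the lookup tables and the helper name are illustrative; stemming is omitted here):</p>

```python
import re

ACRONYMS = {"idk": "i don't know"}            # e.g. "idk" -> "I don't know"
NEGATIONS = {"can't": "cannot", "won't": "will not"}
STOP_WORDS = {"a", "the", "and"}

def preprocess(text):
    text = text.lower()
    text = re.sub(r"http\S+|@\w+", " ", text)   # remove URLs and @username mentions
    text = re.sub(r"[^\x00-\x7f]", " ", text)   # drop non-ASCII characters
    text = re.sub(r"(.)\1{2,}", r"\1", text)    # collapse "noooooo" -> "no"
    for short, full in {**ACRONYMS, **NEGATIONS}.items():
        text = text.replace(short, full)        # expand acronyms and negations
    # tokenize and drop stop words
    return [t for t in re.findall(r"[a-z']+", text) if t not in STOP_WORDS]

print(preprocess("Noooooo I can't sleep http://t.co/x @friend idk"))
```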
      <sec id="sec-4-1">
        <title>Pre-processing steps</title>
        <p>In the pre-processing phase, all the non-ASCII
characters, non-English characters, URLs and
@username mentions are removed, since they do
not contribute any valued information to the
depression detection system. All acronyms are
expanded to their full form, like “idk” as “I
don’t know”. Negative references are replaced
by their full words, i.e. “can’t” is replaced
by “cannot”. Emoticons and emojis are replaced
with their corresponding words. This phase also
performs tokenizing, stemming and removal of
stop words [<xref ref-type="bibr" rid="ref17">17</xref>], [<xref ref-type="bibr" rid="ref20">20</xref>], [<xref ref-type="bibr" rid="ref21">21</xref>]. The tokenizing process
splits the texts into a sequence of tokens. The
stemming process reduces the length of a word
by reducing it to its word stem, like “rained”
and “raining” to “rain”. Stop words such as
“a”, “the” and “and” are removed. In each
word, if a letter appears continuously more
than twice, the word is replaced with its
appropriate form [<xref ref-type="bibr" rid="ref22">22</xref>], [<xref ref-type="bibr" rid="ref23">23</xref>], like “Noooooo” as
“No”.</p>
      </sec>
    </sec>
    <sec id="sec-4-2">
      <title>5. Feature extraction and representation</title>
    </sec>
    <sec id="sec-5">
      <title>5.1. Feature extraction</title>
      <p>From the pre-processed data, the features are
extracted, represented and given as input to
the classification methods. Several features or
attributes are involved in the process of
depression detection, such as the user-level
feature, tweet-level feature, temporal feature,
non-temporal feature, social interaction
feature, content feature, posting behavior
feature, term frequency feature, Bag-Of-Words
(BOW) feature, hashtags, negation, LIWC
feature, word N-gram feature, Part-of-speech
(POS) feature, topic, tweet frequency, RT [<xref ref-type="bibr" rid="ref24">24</xref>]
etc. Several feature extraction techniques are
available as built-in commands in the R
language, SciPy, Numpy etc.</p>
      <p>The tweet-level attributes give
information from the tweet, image, retweets,
comments and likes. The user-level attributes
provide more information on the emotion of the
user; they include the behavior of the user
derived from their social interaction and their
posts. The social interaction attributes carry
information about the content and the structure
in which the user communicates with his or her
friends [<xref ref-type="bibr" rid="ref5">5</xref>], [<xref ref-type="bibr" rid="ref10">10</xref>]. Tweets are classified in time
series for the temporal feature, whereas the
history of tweets is used in the non-temporal
feature. The term frequency feature gives the
frequency count of an individual word or an
n-gram of words. The POS feature finds the
adjectives, since they provide more
information. The negation feature gives the
actual opinion orientation: “not happy” is
equivalent to “sad” [25]. Bag-Of-Words
provides the occurrence of each word in a
document. The word N-gram feature is similar
to Bag-Of-Words; an N-gram can include
phonemes, syllables, letters or words [<xref ref-type="bibr" rid="ref16">16</xref>]. To
reduce the dimension or number of attributes,
Principal Component Analysis (PCA) is used [<xref ref-type="bibr" rid="ref26">26</xref>].</p>
    </sec>
    <sec id="sec-5-2">
      <title>5.2. Representation</title>
      <p>There are several feature representation
models available. Some of them are the
Word2Vec representation, FastText, the Global
vector for word representation (GloVe) model,
the word N-gram feature representation, the
twitter specific feature representation, the
word sentiment polarity score representation,
word representation features, the temporal
feature vector, the non-temporal feature
vector etc.</p>
      <p>The Word2Vec representation uses
continuous skip-gram and BOW features. Based
on the non-temporal feature, an overall
emotion score is calculated. For the temporal
feature, if a user did not tweet anything for a
day, the score for that day is taken as zero;
in this way, an emotion score vector is
calculated [<xref ref-type="bibr" rid="ref27">27</xref>]. In word embedding, all the words
are mapped into a multi-dimensional vector
space, where semantically related words are
neighbors. The word sentiment polarity score
representation finds whether a word has a
strong relationship with positive or
non-positive sentiment; to identify this, it
uses the lexicon based sentiment feature and
Senti-wordnet. FastText is similar to the
skip-gram representation, where each n-gram
has its own vector. Vector representation
helps to improve the performance, as it
provides the hidden details [<xref ref-type="bibr" rid="ref36">36</xref>]. The GloVe
model is a regression model which maps words
with a similar context into a feature
vector [<xref ref-type="bibr" rid="ref28">28</xref>]. The GloVe representation model
proves to be effective and shows improved
performance over state-of-the-art approaches
when combined with a Deep Convolutional
Neural Network [<xref ref-type="bibr" rid="ref28">28</xref>], [<xref ref-type="bibr" rid="ref40">40</xref>].</p>
    </sec>
    <sec id="sec-5-3">
      <title>6. Depression detection methods</title>
      <p>The extracted features and derived
representations are fed as input for further
modeling. Depression can be detected from the
short text data with the help of various
modeling methods, such as Discriminative model
based methods, Ensemble model based methods,
Probabilistic model based methods, ANN based
methods, Deep learning based methods and
Unsupervised learning based methods.</p>
    </sec>
    <sec id="sec-6">
      <title>6.1. Discriminative model based methods</title>
      <p>SVM is a discriminative classifier. SVM is
most suited for text data, because of the
sparse nature of the text. Text data can be
categorized into two categories: user-level
attributes and tweet-level attributes. In the
tweet-level category, first the features are
extracted, and next the features are
segregated into different classes like
depressed words, non-depressed words, polarity
words, stop words etc. In the user-level
category, the user’s tweet history is
considered: all the tweets of the user are
treated like a single tweet and then
tweet-level detection is performed. It uses
BOW to get the vocabulary. Then it is trained
using SVM on the original dataset, the dataset
balanced by under-sampling and the dataset
balanced by over-sampling. It is observed that
user-level classification gives higher
performance with respect to the recall measure
in comparison with tweet-level classification,
even for a limited number of features. Also,
it is difficult to detect whether a user is
depressed or not from a single tweet/post,
hence the user-level category is used [<xref ref-type="bibr" rid="ref9">9</xref>]. It
is also observed that when Linear SVM is
applied on the BOW feature, it provides good
performance in terms of the recall
measure [<xref ref-type="bibr" rid="ref15">15</xref>]. SVM gives good accuracy when
compared with Naïve Bayes and Logistic
Regression methods [<xref ref-type="bibr" rid="ref29">29</xref>]. Table 2 gives the
summary of a few discriminative models.</p>
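      <p>The user-level pipeline of concatenating a user’s tweet history into one document, extracting BOW counts and training a linear SVM can be sketched as follows (assuming scikit-learn is available; the toy tweets and labels are invented for illustration and are not from any cited dataset):</p>

```python
# User-level classification sketch: each user's tweet history is joined into
# one document, turned into Bag-Of-Words counts, and classified with a linear SVM.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

users = {
    "u1": ["feeling hopeless again", "so tired of everything"],
    "u2": ["great run this morning", "lovely dinner with friends"],
    "u3": ["empty and hopeless lately", "tired and sad all day"],
    "u4": ["friends over for game night", "morning coffee was great"],
}
labels = [1, 0, 1, 0]  # 1 = depressed, 0 = not (toy annotation)

docs = [" ".join(tweets) for tweets in users.values()]  # user-level documents
vectorizer = CountVectorizer()                          # BOW vocabulary
X = vectorizer.fit_transform(docs)
clf = LinearSVC().fit(X, labels)

test = vectorizer.transform(["hopeless and tired"])
print(clf.predict(test)[0])
```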
    </sec>
    <sec id="sec-7">
      <title>6.2. Ensemble model based methods</title>
      <p>
        Random Forest (RF) classifier is an ensemble
classifier: it is a multitude of decision trees,
built for more accurate results. To detect
depression from the text data, temporal
features and non-temporal features are used.
The feature vector from the non-temporal
feature is referred to as EMO.
EMO, LIWC and combination of EMO+LIWC
feature sets are given as input to Random
Forest classifier. It is observed that RF gives
high precision and recall than SVM [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]; also
it provides more information with temporal
features [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. RF classifier is also used to
classify the online post and communities into
depressive and non-depressive. On top of the
extracted LIWC feature, RF is applied to
classify them. Hierarchical HMM is used for
determining the degree of depression in the
social communities. RF, Logistic Regression,
and Gaussian NB are applied with different
representation methods such as Word2Vec,
FastText with Skip-gram, and GloVe. RF
provides better performance than the other
models when combined with FastText [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ].
Table 3 gives the summary of a few ensemble
models.
      </p>
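      <p>As a sketch of this setup (assuming scikit-learn is available; the three-dimensional feature vectors and labels below are invented placeholders for real EMO/LIWC values, not data from the cited studies):</p>

```python
# Random Forest over toy EMO/LIWC-style feature vectors.
from sklearn.ensemble import RandomForestClassifier

# Each row: [mean negative-emotion score, mean positive-emotion score,
#            first-person-pronoun rate] -- an EMO/LIWC-like feature vector.
X = [
    [0.9, 0.1, 0.8],
    [0.8, 0.2, 0.7],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.3],
]
y = [1, 1, 0, 0]  # 1 = depressed, 0 = not (toy labels)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(rf.predict([[0.9, 0.1, 0.8]])[0])
```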
    </sec>
    <sec id="sec-8">
      <title>6.3. Probabilistic model based methods</title>
      <p>
        Naïve Bayes is a probability based classifier.
The Naïve Bayes algorithm assumes that each
feature is independent. Bag-Of-Words (BOW)
approach will provide the words with its
occurrence frequency. BOW feature is given
as input to different classification algorithms
like DT, NB, Linear SVM and Logistic
Regression. Each tweet is treated as a
document. Here Bag-Of-Words finds the
occurrence frequency of words related to
depression. Decision tree will provide results
for most of the cases, but it may be unstable
when there is a change in data. Linear SVM is
also used for this purpose, where a straight line
is used to differentiate classes. It uses a
maximum-margin hyperplane to perform this
identification of classes. Logistic Regression
uses the probability of words belonging to a
particular class, and a curve is drawn to identify
the best fit for the depression case. Here Naïve
Bayes theorem shows better performance with
respect to accuracy when compared with other
classifier algorithms. When evaluating with
respect to Precision and F1-score Logistic
Regression gives good performance [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
Also, Naïve Bayes is the best classification
approach when compared with BP neural
network and Decision tree. Also, Naïve Bayes
gives high precision and recall value [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
Table 4 gives the summary of a few
probabilistic models.
      </p>
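      <p>To make the feature-independence assumption concrete, a from-scratch multinomial Naïve Bayes over Bag-Of-Words counts can be sketched as below (the toy documents, labels and function names are illustrative only):</p>

```python
import math
from collections import Counter

def train_nb(docs, labels):
    # Class priors and per-class word counts from whitespace-tokenized docs.
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, c in zip(docs, labels):
        counts[c].update(doc.split())
    vocab = {w for cnt in counts.values() for w in cnt}
    return prior, counts, vocab

def predict_nb(model, doc):
    prior, counts, vocab = model
    scores = {}
    for c in prior:
        total = sum(counts[c].values())
        # log P(c) + sum over words of log P(w|c), Laplace-smoothed;
        # the sum over independent word terms IS the independence assumption.
        score = math.log(prior[c])
        for w in doc.split():
            score += math.log((counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

docs = ["sad empty tired", "hopeless sad alone", "happy fun friends", "great happy day"]
labels = [1, 1, 0, 0]
model = train_nb(docs, labels)
print(predict_nb(model, "sad and tired"))
```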
    </sec>
    <sec id="sec-9">
      <title>6.4. Artificial Neural Network (ANN) based methods</title>
      <p>
        Artificial Neural Network (ANN) models are
combined with several unsupervised learning
models to detect depression from social media
text data. Some of these unsupervised learning
models are the Biterm Topic Model (BTM),
Word2vec and the Replicated Softmax Machine
(RSM). BTM identifies words that appear
together; it will identify two words that
appear together if the window size is given as
two. BTM uses topics to represent the hidden
aspects of the document. Word2vec is a word
embedding process that identifies both
semantic and syntactic regularities in the
sentence, and it groups words into clusters if
their vectors have similar semantic meanings,
i.e. it computes the association between words
and groups them together. RSM is similar to a
term frequency counter: it counts the
occurrences of a particular word in the
collected vocabulary, and it also identifies
the hidden topical structure. On top of these
unsupervised learning models, a Stochastic
Gradient Descent (SGD) model is applied. SGD
acts as a transfer learning approach, as it
transfers the high-level semantic features to
the ANN. In order to filter the noisy features
and to maintain the stability of this model, a
Sparse Encoding method is applied. The
transfer learning approach used in this Hybrid
Neural Network (HNN) is called the Latent
Semantic Machine (LSM). It accepts the raw
features from the unsupervised learning models
and derives from them a high-level semantic
feature mixture, which is fed into the neural
network. It is observed that HNN+BTM with one
LSM and HNN+BTM with two LSMs performed better
in terms of the F1 measure than HNN with other
unsupervised learning models. It is also
observed that HNN+Word2vec with sparse
encoding gives better performance than HNN+RSM
and HNN+Word2vec without sparse encoding. The
selection of the unsupervised learning models
for extracting the source features added more
value to this HNN model [<xref ref-type="bibr" rid="ref11">11</xref>].
      </p>
      <p>
        Feed Forward (FF) is a type of ANN. The
Reddit dataset is pre-processed and fed to the
FF neural network. This FF modeling is used
for multiclass classification, which involves
classes such as “selfharm”, “suicidewatch”,
“anxiety” and “depression”. It is observed
that the FF classifier gives more accurate
results when compared with SVM and linear
regression [<xref ref-type="bibr" rid="ref31">31</xref>]. Table 5 gives the summary of a
few ANN based models.
      </p>
    </sec>
    <sec id="sec-11">
      <title>6.5. Deep learning based methods</title>
    </sec>
    <sec id="sec-13">
      <title>6.5.1. Convolutional Neural Networks (CNN)</title>
      <p>CNN with Factor Graph Model (FGM).
CNN is combined with the Factor Graph Model
(FGM) to extract more tweet-level and
user-level information. In this approach, the
CNN method is applied on the dataset along
with Cross Auto Encoders (CAE). CNN provides
the user-level attributes, which are obtained
from the tweet level. These are then given as
input to the next phase, FGM. FGM considers
three factors and three aspects of these
attributes to map them into states. The three
factors are the attribute factor, the dynamic
factor and the social factor. The attribute
factor depicts the correlation of the stress
state and time with the attributes. The
dynamic factor gives the correlation of the
stress state with dynamic time. The social
factor depicts the correlation between the
stress state and time with polarity comments.
The three main aspects which FGM takes into
account are the following user-level
attributes: posting behavior, content and
social interaction. Based on these factors and
aspects, the user-level attributes are mapped
to the respective stress state level. This
CNN+FGM gives better performance, by providing
the highest precision and recall, when
compared with traditional methods like SVM, RF
and LR [<xref ref-type="bibr" rid="ref10">10</xref>].</p>
      <p>CNN with global max pooling layer. The
preprocessing of twitter data provides a
vocabulary for further phases. The words are
encoded into a sequence of fixed length, and
the occurrence of a word is limited to two
times in that sequence. Then unsupervised
training models are used to transform the
encoded words into a low-dimensional vector.
Many models are available for this, like
Skip-gram and CBOW, which are the two layers
of the Word2Vec model. Skip-gram concentrates
on the contextual words and is able to detect
rare words, whereas CBOW concentrates on the
current words and is much like a continuous
Skip-gram. This unsupervised training is
performed with different senses and involves
two tasks: predicting the word and the sense
from the input. For this, it first identifies
the words that occur together; for example,
“happy” can come with words like “journey”,
“morning” and “birthday”. Then a Rectified
Linear Unit (ReLU) is used, which identifies
the label for the missing data and the sense
of the sentence, thereby producing the label
output. On top of these embeddings, variants
of CNN are applied. CNNWithMAX means a
convolution with 250 layers is applied,
followed by a global max pooling layer to
extract the global information. In
MultiChannelCNN, CNN is applied three times,
with filters of length 3, 4 and 5.
MultiChannelPoolingCNN is the same as
MultiChannelCNN but with two different
max-pooling sizes, 2 and 5. MultiChannelCNN
and a bi-directional GRU are combined to give
more accuracy than CNN [<xref ref-type="bibr" rid="ref38">38</xref>]. These CNN
variants are compared with the RNN model, and
it is observed that CNN with a global max
pooling layer gives higher performance than
the RNN based model by providing the highest
precision and recall [<xref ref-type="bibr" rid="ref7">7</xref>].</p>
      <p>DCNN with the Global vector for word
representation (GloVe) model. The DCNN method
helps to identify whether the tweets express
positive or non-positive emotion. Before
applying the Deep Convolutional Neural Network
(DCNN), the tweets are preprocessed and
features are extracted and represented as a
feature vector using the GloVe model. The
GloVe model is a regression model which
combines the following two methods: the local
context window and global matrix
factorization. The DCNN is applied on the
vector generated by the GloVe model. The
twitter specific feature vector, the unigram
and bigram feature vector and the word
sentiment polarity score feature vector are
combined into a single feature vector. In the
first convolutional layer, a convolutional
filter is applied on top of the combined
feature vector to get a new vector, which is
mapped to a fixed-length vector. A
convolutional layer is applied again to get a
new vector. This GloVe+DCNN model uses three
k-max pooling layers and three convolutional
layers to give the probability of positive or
negative sentiment in the tweet. It is
observed that GloVe+DCNN provides high
precision and recall when compared with
BoW/GloVe with SVM or LR [<xref ref-type="bibr" rid="ref28">28</xref>].</p>
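      <p>The step of turning a tweet into a fixed-length feature vector from word embeddings can be sketched by averaging, as below (the 3-dimensional vectors are invented for illustration; real GloVe vectors are typically 50–300 dimensional and loaded from a pretrained file):</p>

```python
# Toy GloVe-style lookup table: word -> embedding vector (invented values).
EMBEDDINGS = {
    "sad":      [0.9, 0.1, 0.2],
    "hopeless": [0.8, 0.2, 0.1],
    "happy":    [0.1, 0.9, 0.8],
    "fun":      [0.2, 0.8, 0.9],
}

def tweet_vector(tweet, dim=3):
    """Average the embeddings of known words; zero vector if none are known."""
    vecs = [EMBEDDINGS[w] for w in tweet.split() if w in EMBEDDINGS]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

print(tweet_vector("so sad and hopeless"))
```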
    </sec>
    <sec id="sec-14">
      <title>6.5.2. Recurrent Neural Networks (RNN)</title>
      <p>
RNN is widely used in NLP. The Word2Vec model
is used to represent the vocabulary. It also
helps to determine or predict the word and
sense from the input. A Rectified Linear Unit
(ReLU) is also used. ReLU helps in
identifying the missing label for the data and
also identifies the sense of the sentence. This
embedding is given as input to RNN model.
RNN is applied with Bidirectional LSTM and
context-aware attention. LSTM prevents error
from exploding and vanishing gradient
problems. Bidirectional RNN connects the
output from two hidden layers of opposite
direction to the same output. Bidirectional
LSTM helps to concatenate both forward and
backward representation. Context-aware
attention provides the weighted sum of all
words in a sequence and also it helps to focus
on the more important words. It is observed
that the optimized embedding performed better
than the trainable random embedding for
RNN. Also, when compared with CNN based
models, RNN shows low performance with
respect to precision and recall [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        LSTM and Gated Recurrent Unit (GRU).
LSTM and GRU are best suited for predicting
long-term data involving delays. Combining
GRU with LSTM helps in handling the main
difficulty of LSTM, which is its training
speed. GloVe representation is used to utilize
both local and global details of the data.
Among RNN, LSTM, GRU and LSTM-GRU,
LSTM-GRU provides better performance [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ].
Table 6 gives the summary of a few deep
learning based models.
      </p>
    </sec>
    <sec id="sec-15">
      <title>6.6. Unsupervised learning based methods</title>
      <p>
        K-means is an unsupervised learning method.
Before applying k-means to the observations,
the collected data is pre-processed. Then the
data is analyzed by calculating the word
frequency. Words in the vocabulary are
represented as vectors, using one-hot
encoding or a word embedding process; the
Word2Vec model can also be used to generate
vectors. Then k-means clustering is applied,
and words with similar meaning are grouped
together in clusters. Based on cosine
similarity, it is easy to accumulate
semantically similar words in the same
cluster [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. For the Latent Dirichlet
Allocation (LDA) method, the extracted
N-gram features are fed as input. LDA is applied
on the term-document matrix and outputs a
topic-document matrix, which is fed into a
Multilayer Perceptron (MLP). The MLP works
with 30 topics as input and two hidden layers of
60 and 30 units. It gives comparatively
moderate performance with respect to
precision and recall, which is due to the
unsupervised nature of the topic extraction
[
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]. Table 7 gives the summary of a few
unsupervised learning based models.
[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] Twitter Streaming API
[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] LiveJournal
      </p>
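      <p>The clustering step described above can be sketched with scikit-learn. The toy 2-d vectors below stand in for Word2Vec embeddings; L2-normalizing them first makes Euclidean k-means agree with cosine similarity, since for unit vectors the squared distance is 2 minus twice the cosine.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy embeddings standing in for Word2Vec vectors (illustrative only).
words = ["sad", "unhappy", "depressed", "football", "soccer", "tennis"]
vecs = np.array([[0.9, 0.1], [0.8, 0.2], [0.95, 0.05],
                 [0.1, 0.9], [0.2, 0.8], [0.05, 0.95]])

# L2-normalize so that Euclidean k-means groups by cosine similarity.
unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(unit)
clusters = {c: [w for w, l in zip(words, labels) if l == c]
            for c in set(labels)}
print(clusters)  # words with similar meaning fall into the same cluster
```
</p>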
      <p>
        The corresponding table entries compare
K-means, where cosine similarity helps in easy
clustering [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], LDA (Precision 0.32, Recall 0.62) [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], non-temporal (EMO), temporal (EMO-TS)
and LIWC features [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], LIWC features [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], Skip-gram embeddings on a Twitter
dataset [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ], and user-level and tweet-level features
from Sina Weibo's REST APIs and Tencent
Weibo, classified with a CNN with FGM
(Precision 0.90, Recall 0.96) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-16">
      <title>7.1. Dataset overview</title>
      <p>The dataset used in the following experimental
analyses is the CLEF eRisk 2018 dataset. The
aim of CLEF eRisk is to identify people
liable to depression from the data available
on the Internet. It paved the way for
interdisciplinary research in the field of
depression-related problems. People under
depression can be alerted when early signs of
depression are found. The eRisk 2017 dataset
focused on early risk prediction of situations
with multiple actors (e.g., child sexual abuse)
and with single actors (e.g., depression, bipolar
disorder, teenage distress) from online text
data. The eRisk 2018 dataset is built from the
2017 dataset and targets the early prediction
of depression and anorexia among social
media users. Both eRisk 2017 and eRisk 2018
use the same source of data, i.e., they collect
social media texts from a particular collection
of users. Each user's data is arranged in
chronological order in 10 chunks, from oldest
to newest. The collection provides data for both
training and testing. The training data is divided
into depressed and control (i.e., non-depressed)
groups. The eRisk 2017 dataset is a
collection of writings from 887 social media
users, of whom 135 are depressed. The eRisk
2018 dataset is an extended collection of the
2017 dataset, consisting of writings from 1,707
users, of whom 214 are depressed.</p>
    </sec>
    <sec id="sec-17">
      <title>7.2. Methodologies used</title>
      <p>Analysis with TF-IDF representation and
LDA. The eRisk dataset is pre-processed as an
initial step. The TF-IDF vectorizer is well
suited to text datasets, as it provides the
list of unique words used in the dataset along
with their frequency of occurrence, which
helps in classifying the words under a
particular set of topics. The TF-IDF vectorizer
of scikit-learn converts the writings of social
media users into a matrix of TF-IDF features.
The term matrix extracted using the TF-IDF
vectorizer is given as input to Latent
Dirichlet Allocation (LDA), whose output is a
topic matrix: each document is composed of
different topics or attributes, and each topic
is composed of different words. This topic
matrix is given as input to an MLP model
consisting of two intermediate layers of 50 and
20 units. By this approach, each user is
labeled as depressed or not. The performance
of the TF-IDF and LDA model is depicted in
Table 8.</p>
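      <p>A minimal sketch of this pipeline in scikit-learn follows. The layer sizes (50, 20) come from the text; the toy corpus, labels and hyperparameters such as the topic count are illustrative stand-ins for the eRisk data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.neural_network import MLPClassifier

# Toy corpus standing in for the eRisk user writings (illustrative only).
docs = ["i feel hopeless and tired every day",
        "great match today we won the game",
        "nothing matters anymore i cannot sleep",
        "looking forward to the weekend trip"]
labels = [1, 0, 1, 0]  # 1 = depressed, 0 = control

# Writings -> matrix of TF-IDF features.
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Term-document matrix -> topic-document matrix.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(X)

# Topic matrix -> MLP with two intermediate layers of 50 and 20 units.
mlp = MLPClassifier(hidden_layer_sizes=(50, 20), max_iter=2000,
                    random_state=0)
mlp.fit(topics, labels)
print(mlp.predict(topics).tolist())
```
</p>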
      <p>Analysis with GloVe and RNN. GloVe
combines the advantages of local context
window methods and global matrix
factorization to provide meaningful word
insights. The GloVe model provides
promising results for text classification. It is
combined with an RNN model, as RNNs are
widely used for text classification and
Natural Language Processing (NLP). The
eRisk dataset is pre-processed and tokenized
and then given as input to the GloVe
representation model. To provide meaningful
statistics, GloVe forms the word-to-word
co-occurrence matrix. The resulting GloVe
representation is given as input to the RNN
model, which involves two hidden layers of
varying units. The output layer of the RNN
labels each user as a depressed or
non-depressed person. The performance of the
GloVe and RNN model is depicted in Table 8.</p>
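      <p>The co-occurrence statistics mentioned above can be sketched in a few lines of plain Python; GloVe then fits word vectors to the logarithm of these counts by weighted least squares. This simplified version omits GloVe's distance weighting, and the token sequence is illustrative.

```python
from collections import defaultdict

def cooccurrence(tokens, window=2):
    """Symmetric word-word co-occurrence counts within a context
    window -- the statistic that GloVe factorizes."""
    counts = defaultdict(float)
    for i, w in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if i != j:
                counts[(w, tokens[j])] += 1.0
    return counts

tokens = "i feel sad i feel tired".split()
X = cooccurrence(tokens, window=1)
print(X[("i", "feel")])  # "i" and "feel" are adjacent twice -> 2.0
```
</p>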
      <p>
        Analysis with GloVe and CNN. Since the GloVe
model is observed to be effective for sentiment
analysis of text data [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], the GloVe
representation model is combined with a CNN
and the result is analyzed. The dataset taken for
this analysis contains a few empty writings,
which are ignored. The dataset is then
pre-processed while preserving emoticons and
symbols, since they provide valuable
information. Each user's writing in each chunk
is analyzed to form a matrix of words using
a pre-trained set of word embeddings. This
pre-processed, tokenized input is given to a
single convolutional layer of 100 filters with
CReLU activation. A single max-pooling layer
follows, and each user is classified as depressed
or not. When the GloVe model is combined with
different layers of CNN and LSTM networks,
the performance is observed to be highest for
GloVe with multiple layers of CNN and
bi-LSTM [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ]. The performance of the GloVe
and CNN model is depicted in Table 8.
      </p>
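      <p>The single convolution-and-pooling step can be sketched in plain NumPy. The 100-filter count and CReLU activation follow the text; the sequence length, embedding dimension, filter width and random values are illustrative, and a trained classifier head is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: 10 tokens, 8-dim "GloVe" vectors, 100 filters of width 3.
seq = rng.normal(size=(10, 8))          # token embeddings for one user
filters = rng.normal(size=(100, 3, 8))  # (n_filters, width, emb_dim)

def crelu(x):
    # CReLU concatenates ReLU(x) and ReLU(-x), doubling the features.
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=-1)

# Single convolutional layer: slide each filter over the token sequence.
width = filters.shape[1]
n_pos = seq.shape[0] - width + 1
conv = np.stack([(seq[t:t + width][None] * filters).sum(axis=(1, 2))
                 for t in range(n_pos)])  # shape (n_pos, 100)

feat = crelu(conv).max(axis=0)  # global max pooling -> fixed-size vector
print(feat.shape)               # (200,)
```
</p>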
    </sec>
    <sec id="sec-18">
      <title>7.3. Performance analysis</title>
      <p>The classification report and confusion
matrix are used to analyze the performance of
the above methodologies. Table 8 shows the
Precision, Recall and F1 of these three
methods.</p>
      <p>The TF-IDF representation focuses
mainly on the frequency of word occurrence in
documents; it then maps each word to an
appropriate topic and classifies accordingly. In
this case, whenever a word related to
depression appears, the user is classified as
depressed, which is not ideal. The global
vector for word representation considers both
the frequency of word occurrence and the
frequency of co-occurrence of words, thereby
providing more valuable information for
classification. The GloVe representation is
found to be significantly better than the most
commonly used word representation, TF-IDF.
The RNN and CNN classifiers work well with
text representations, and their performance is
analyzed with the GloVe representation. From
the table, it is found that the GloVe
representation is better than the TF-IDF
representation. Also, the GloVe representation
performs better with CNN than with RNN,
because the RNN model gives better results
with longer word embeddings. From the
analysis, it is found that the GloVe
representation with the CNN classifier provides
comparatively better results.</p>
    </sec>
    <sec id="sec-19">
      <title>Future directions</title>
      <p>The performance of a depression detection
system can be improved or made more
meaningful with the following directions for
future research.</p>
      <list list-type="bullet">
        <list-item>
          <p>The depression detection task can also be
performed by extracting emotions from
speech data.</p>
        </list-item>
        <list-item>
          <p>The task can be extended by grouping users
based on gender, age, location and other
demographic attributes.</p>
        </list-item>
        <list-item>
          <p>Spatiotemporal features from video data
can also be included, as they contribute
more information.</p>
        </list-item>
        <list-item>
          <p>The daily variation of a user's depression
can also be monitored.</p>
        </list-item>
        <list-item>
          <p>The system can be extended by including
medical context, so that clinical depression
can be detected from social media data.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-20">
      <title>Conclusion</title>
      <p>This paper provided an overview of
depression detection systems and an analysis
of global word representations for short
text data. The datasets and machine learning
methods used in recent years for depression
detection were summarized. The global word
representation model, which proved to be
effective, was analyzed with different
classifiers. Various challenges and directions
for future research were also summarized.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Juyoung</given-names>
            <surname>Song</surname>
          </string-name>
          , Tae Min Song,
          <string-name>
            <surname>Dong-Chul Seo</surname>
          </string-name>
          , and Jae Hyun Jin.:
          <article-title>“Data Mining of Web-Based Documents on Social Networking Sites that Included Suicide-Related Words among Korean Adolescents”</article-title>
          .
          <source>Journal of Adolescent Health</source>
          ,
          <volume>59</volume>
          (
          <issue>6</issue>
          ):
          <fpage>668</fpage>
          -
          <lpage>673</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Budhaditya</given-names>
            <surname>Saha</surname>
          </string-name>
          , Thin Nguyen, Dinh Phung, and Svetha Venkatesh.:
          <article-title>“A Framework for Classifying Online Mental Health-Related Communities with an Interest in Depression”</article-title>
          .
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <volume>20</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1008</fpage>
          -
          <lpage>1015</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Sharath</given-names>
            <surname>Chandra</surname>
          </string-name>
          <string-name>
            <given-names>Guntuku</given-names>
            , David B Yaden, Margaret L Kern,
            <surname>Lyle H Ungar</surname>
          </string-name>
          ,
          <article-title>and Johannes C Eichstaedt.: “Detecting Depression and Mental Illness on Social Media: An Integrative Review”</article-title>
          .
          <source>Current Opinion in Behavioral Sciences</source>
          ,
          <volume>18</volume>
          :
          <fpage>43</fpage>
          -
          <lpage>49</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Elizabeth</surname>
            <given-names>M Seabrook</given-names>
          </string-name>
          , Margaret L Kern,
          <string-name>
            <surname>Ben D Fulcher</surname>
          </string-name>
          , and Nikki S Rickard.: “
          <article-title>Predicting Depression from Language-Based Emotion Dynamics: Longitudinal Analysis of Facebook and Twitter Status Updates”</article-title>
          .
          <source>Journal of Medical Internet research</source>
          ,
          <volume>20</volume>
          (
          <issue>5</issue>
          )
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Guangyao</given-names>
            <surname>Shen</surname>
          </string-name>
          , Jia Jia, Liqiang Nie, Fuli Feng, Cunjun Zhang, Tianrui Hu, TatSeng Chua, and Wenwu Zhu.:
          <article-title>“Depression Detection via Harvesting Social Media: A Multimodal Dictionary Learning Solution”</article-title>
          .
          <source>In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17)</source>
          , pages
          <fpage>3838</fpage>
          -
          <lpage>3844</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Andrew G</given-names>
            <surname>Reece</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher M</given-names>
            <surname>Danforth</surname>
          </string-name>
          <article-title>: “Instagram Photos Reveal Predictive Markers of Depression”</article-title>
          .
          <source>EPJ Data Science</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>15</fpage>
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Husseini</surname>
          </string-name>
          <string-name>
            <surname>Orabi</surname>
          </string-name>
          , Prasadith Buddhitha, Mahmoud Husseini Orabi, and Diana Inkpen.:
          <article-title>“Deep Learning for Depression Detection of Twitter Users”</article-title>
          .
          <source>In Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic</source>
          , pages
          <fpage>88</fpage>
          -
          <lpage>97</lpage>
          2018.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Harshali</surname>
            <given-names>P</given-names>
          </string-name>
          <string-name>
            <surname>Patil and Mohammad Atique</surname>
          </string-name>
          .: “
          <article-title>Sentiment Analysis for Social Media: A Survey”</article-title>
          .
          <source>In Information Science and Security (ICISS)</source>
          ,
          <year>2015</year>
          2nd International Conference on, pages
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          2015.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Zunaira</given-names>
            <surname>Jamil</surname>
          </string-name>
          .: “Monitoring Tweets for Depression to Detect At-risk Users”,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Huijie</surname>
            <given-names>Lin</given-names>
          </string-name>
          , Jia Jia, Jiezhong Qiu, Yongfeng Zhang, Guangyao Shen, Lexing Xie, Jie Tang, Ling Feng, and TatSeng Chua.
          <article-title>: “Detecting Stress Based on Social Interactions in Social Networks”</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>29</volume>
          (
          <issue>9</issue>
          ):
          <fpage>1820</fpage>
          -
          <lpage>1833</lpage>
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Xiangsheng</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yanghui</given-names>
            <surname>Rao</surname>
          </string-name>
          , Haoran Xie, Raymond Yiu Keung Lau, Jian Yin, and Fu Lee Wang.:
          <article-title>“Bootstrapping Social Emotion Classification with Semantically Rich Hybrid Neural Networks”</article-title>
          .
          <source>IEEE Transactions on Affective Computing</source>
          ,
          <volume>8</volume>
          (
          <issue>4</issue>
          ):
          <fpage>428</fpage>
          -
          <lpage>442</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Monireh</surname>
            <given-names>Ebrahimi</given-names>
          </string-name>
          , Amir Hossein Yazdavar, and Amit Sheth.:
          <article-title>“Challenges of Sentiment Analysis for Dynamic Events”</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>32</volume>
          (
          <issue>5</issue>
          ):
          <fpage>70</fpage>
          -
          <lpage>75</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Zhaoxia</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Chee Seng Chong, Landy Lan, Yinping Yang, Seng Beng Ho, and Joo Chuan Tong.: “
          <article-title>Fine-Grained Sentiment Analysis of Social Media with Emotion Sensing”</article-title>
          .
          <source>In Future Technologies Conference (FTC)</source>
          , pages
          <fpage>1361</fpage>
          -
          <lpage>1364</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Sara</surname>
            <given-names>Rosenthal</given-names>
          </string-name>
          , Noura Farra, and Preslav Nakov.: “SemEval
          <article-title>-2017 task 4: Sentiment Analysis in Twitter”</article-title>
          .
          <source>In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)</source>
          , pages
          <fpage>502</fpage>
          -
          <lpage>518</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Moin</given-names>
            <surname>Nadeem</surname>
          </string-name>
          .: “Identifying Depression on Twitter”.
          <source>arXiv preprint arXiv:1607.07384</source>
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Maxim</surname>
            <given-names>Stankevich</given-names>
          </string-name>
          , Vadim Isakov, Dmitry Devyatkin, and Ivan Smirnov.: “Feature Engineering for Depression Detection in Social Media”
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <article-title>Maryam Mohammed Aldarwish and Hafiz Farooq Ahmad.: “Predicting Depression Levels Using Social Media Posts”</article-title>
          .
          <source>In Autonomous Decentralized System (ISADS)</source>
          ,
          <source>2017 IEEE 13th International Symposium on</source>
          , pages
          <fpage>277</fpage>
          -
          <lpage>280</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Felix</given-names>
            <surname>Ming Fai Wong</surname>
          </string-name>
          , Chee Wei Tan,
          <string-name>
            <surname>Soumya Sen</surname>
          </string-name>
          , and Mung Chiang.:
          <article-title>“Quantifying Political Leaning from Tweets, Retweets, and Retweeters”</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>28</volume>
          (
          <issue>8</issue>
          ):
          <fpage>2158</fpage>
          -
          <lpage>2172</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Zhenhua</surname>
            <given-names>Zhang</given-names>
          </string-name>
          , Qing He,
          <string-name>
            <surname>Jing Gao</surname>
          </string-name>
          , and Ming Ni.:
          <article-title>“A Deep Learning Approach for Detecting Traffic Accidents from Social Media Data”</article-title>
          . Transportation Research Part C: Emerging Technologies,
          <volume>86</volume>
          :
          <fpage>580</fpage>
          -
          <lpage>596</lpage>
          2018.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C</given-names>
            <surname>Sindhu</surname>
          </string-name>
          , Dyawanapally Veda Vyas, and Kommareddy Pradyoth.: “
          <article-title>Sentiment Analysis Based Product Rating Using Textual Reviews”</article-title>
          .
          <source>In Electronics, Communication and Aerospace Technology (ICECA)</source>
          , 2017 International conference of, volume
          <volume>2</volume>
          , pages
          <fpage>727</fpage>
          -
          <lpage>731</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Anukarsh G Prasad</surname>
            ,
            <given-names>S Sanjana</given-names>
          </string-name>
          , Skanda M Bhat, and
          <string-name>
            <given-names>B S</given-names>
            <surname>Harish.</surname>
          </string-name>
          <article-title>: “Sentiment Analysis for Sarcasm Detection on Streaming Short Text Data”</article-title>
          .
          <source>In Knowledge Engineering and Applications (ICKEA)</source>
          ,
          <year>2017</year>
          2nd International Conference on, pages
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <article-title>Sonia Xylina Mashal and Kavita Asnani.: “Emotion Intensity Detection for Social Media Data”</article-title>
          .
          <source>In Computing Methodologies and Communication (ICCMC)</source>
          , 2017 International Conference on, pages
          <fpage>155</fpage>
          -
          <lpage>158</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Al-Saffar</surname>
          </string-name>
          , Suryanti Awang, Hai Tao, Nazlia Omar, Wafaa Al-Saiagh, and
          <string-name>
            <surname>Mohammed</surname>
          </string-name>
          Al-bared.:
          <article-title>“Malay Sentiment Analysis Based on Combined Classification Approaches and SentiLexicon Algorithm”</article-title>
          .
          <source>PloS One</source>
          ,
          <volume>13</volume>
          (
          <issue>4</issue>
          ):e0194852
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Sho</surname>
            <given-names>Tsugawa</given-names>
          </string-name>
          , Yusuke Kikuchi, Fumio Kishino, Kosuke Nakajima, Yuichi Itoh, and Hiroyuki Ohsaki.: “
          <article-title>Recognizing Depression from Twitter Activity”</article-title>
          .
          <source>In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems</source>
          , pages
          <fpage>3187</fpage>
          -
          <lpage>3196</lpage>
          2015.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Amit</surname>
            <given-names>G</given-names>
          </string-name>
          <string-name>
            <surname>Shirbhate and Sachin N Deshmukh.</surname>
          </string-name>
          <article-title>: “Feature Extraction for Sentiment Classification on Twitter Data”</article-title>
          .
          <source>International Journal of Science and Research (IJSR) ISSN (Online)</source>
          , pages
          <fpage>2319</fpage>
          -
          <lpage>7064</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Changye</surname>
            <given-names>Zhu</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Baobin</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ang</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Tingshao</given-names>
            <surname>Zhu</surname>
          </string-name>
          .: “
          <article-title>Predicting Depression from Internet Behaviors by Time-Frequency Features”</article-title>
          .
          <source>In Web Intelligence (WI)</source>
          ,
          <year>2016</year>
          IEEE/WIC/ACM International Conference on, pages
          <fpage>383</fpage>
          -
          <lpage>390</lpage>
          2016.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Xuetong</surname>
            <given-names>Chen</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin D Sykora</surname>
          </string-name>
          ,
          <string-name>
            <surname>Thomas W Jackson</surname>
            , and
            <given-names>Suzanne</given-names>
          </string-name>
          <string-name>
            <surname>Elayan</surname>
          </string-name>
          .: “What About Mood Swings:
          <article-title>Identifying Depression on Twitter with Temporal Measures of Emotions”</article-title>
          .
          <source>In Companion of the The Web Conference 2018 on The Web Conference</source>
          <year>2018</year>
          , pages
          <fpage>1653</fpage>
          -
          <lpage>1660</lpage>
          2018.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Zhao</surname>
            <given-names>Jianqiang</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gui Xiaolin</surname>
          </string-name>
          , and Zhang Xuejun.:
          <article-title>“Deep Convolution Neural Networks for Twitter Sentiment Analysis”</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>6</volume>
          :
          <fpage>23253</fpage>
          -
          <lpage>23260</lpage>
          2018.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Anees</given-names>
            <surname>Ul</surname>
          </string-name>
          <string-name>
            <surname>Hassan</surname>
          </string-name>
          , Jamil Hussain, Musarrat Hussain, Muhammad Sadiq, and
          <string-name>
            <given-names>Sungyoung</given-names>
            <surname>Lee</surname>
          </string-name>
          .: “
          <article-title>Sentiment Analysis of Social Networking Sites (SNS) Data Using Machine Learning Approach for the Measurement of Depression”</article-title>
          .
          <source>In Information and Communication Technology Convergence (ICTC)</source>
          , 2017 International Conference on, pages
          <fpage>138</fpage>
          -
          <lpage>140</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Iram</surname>
            <given-names>Fatima</given-names>
          </string-name>
          , Hamid Mukhtar, Hafiz Farooq Ahmad, and Kashif Rajpoot.:
          <article-title>“Analysis of User-Generated Content from Online Social Communities to Characterise and Predict Depression Degree”</article-title>
          .
          <source>Journal of Information Science, page 0165551517740835</source>
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>George</surname>
            <given-names>Gkotsis</given-names>
          </string-name>
          , Anika Oellrich, Sumithra Velupillai, Maria Liakata, Tim JP Hubbard, Richard JB Dobson, and Rina Dutta.:
          <article-title>“Characterisation of Mental Health Conditions in Social Media using Informed Deep Learning”</article-title>
          .
          <source>Scientific reports</source>
          ,
          <volume>7</volume>
          :
          <fpage>45141</fpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Long</surname>
            <given-names>Ma</given-names>
          </string-name>
          , Zhibo Wang, and Yanqing Zhang.: “
          <article-title>Extracting Depression Symptoms from Social Networks and Web Blogs via Text Mining”</article-title>
          .
          <source>In International Symposium on Bioinformatics Research and Applications</source>
          , pages
          <fpage>325</fpage>
          -
          <lpage>330</lpage>
          2017.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33] Maupomé.: “
          <article-title>Using Topic Extraction on Social Media Content for the Early Detection of Depression” 2018.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalyani</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Basu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <article-title>: “Early Detection of Signs of Anorexia and Depression Over Social Media using Effective Machine Learning Frameworks”</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Trotzek</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koitka</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Friedrich</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,:
          <article-title>“Word Embeddings and Linguistic Metadata at the CLEF 2018 Tasks for Early Detection of Depression and Anorexia”</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Kaibi</surname>
          </string-name>
          , Ibrahim, and Hassan Satori.:
          <article-title>“A comparative evaluation of word embeddings techniques for twitter sentiment analysis”</article-title>
          .
          <source>In 2019 International Conference on Wireless Technologies, Embedded and Intelligent Systems (WITS)</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          . IEEE 2019.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <surname>Ni</surname>
          </string-name>
          , Ru, and Huan Cao.:
          <article-title>“Sentiment Analysis based on GloVe and LSTM-GRU”</article-title>
          .
          <source>In 2020 39th Chinese Control Conference (CCC)</source>
          , pp.
          <fpage>7492</fpage>
          -
          <lpage>7497</lpage>
          . IEEE 2020.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38] Cheng, Yan, Leibo Yao, Guoxiong Xiang, Guanghe Zhang, Tianwei Tang, and Linhui Zhong.:
          <article-title>“Text sentiment orientation analysis based on multi-channel CNN and bidirectional GRU with attention mechanism”</article-title>
          .
          <source>IEEE Access</source>
          <volume>8</volume>
          :
          <fpage>134964</fpage>
          -
          <lpage>134975</lpage>
          , 2020.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Goularas</surname>
          </string-name>
          , Dionysis, and Sani Kamis.:
          <article-title>“Evaluation of deep learning techniques in sentiment analysis from Twitter data”</article-title>
          .
          <source>In 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications (Deep-ML)</source>
          , pp.
          <fpage>12</fpage>
          -
          <lpage>17</lpage>
          . IEEE 2019.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Abid</surname>
            ,
            <given-names>Fazeel</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam</surname>
            ,
            <given-names>Muhammad</given-names>
          </string-name>
          , and Adnan Abid.:
          <article-title>“Representation of Words Over Vectors in Recurrent Convolutional Attention Architecture for Sentiment Analysis”</article-title>
          .
          <source>In 2019 International Conference on Innovative Computing (ICIC)</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . IEEE 2019.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>