<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>YETI at FakeDeS 2021: Fake News Detection in Spanish with ALBERT</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hongxin Luo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information Science and Engineering, Yunnan University</institution>
          ,
          <addr-line>Yunnan</addr-line>
          ,
          <country country="CN">P.R. China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes our participation in the IberLEF 2021 shared task 7: the Fake News Detection task. The goal of this task is to analyze a corpus of Spanish news and determine the authenticity of its content. False information is designed to negatively affect people by disseminating information that does not match the facts, so that users accept biased or erroneous information. Therefore, detecting fake news becomes particularly important. For this task, this paper discusses different methods of fake news detection. We chose the ALBERT model, and on this basis we made a simple modification to the upper structure of the ALBERT model. In the end, our system obtained a 63.16% F1 score in the task. Although our proposal did not achieve the best result, it provides a new idea for fake news detection.</p>
      </abstract>
      <kwd-group>
        <kwd>ALBERT</kwd>
        <kwd>Fake News Classification</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Deep-learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Many years ago, the main channels for us to obtain news and information were
television and newspapers. In recent years, with the rise of the mobile Internet,
more and more people are choosing to obtain news information from social media.
But the quality of news on social media is far lower than that of traditional media. Since
anyone can easily publish a news article on social media, the quality of articles
on social media is uneven, and there is even a great deal of fake news.</p>
      <p>
        False information is designed to negatively influence people and to
deliberately persuade users to accept biased or erroneous information. Therefore,
detecting fake news on social media becomes particularly important. This is the
mission of IberLEF2021 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a forum that aims to encourage research on social media
content analysis in Spanish [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In this work, we explored the task of fake news
detection in IberLEF2021 from the perspective of deep learning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This task can
be regarded as a binary classification task in Spanish. The corpus consists of news
compiled mainly from Mexican web sources: established newspaper websites,
media company websites, special websites dedicated to validating fake news, and
websites designated by different journalists as sites that regularly publish fake
news. The news was collected from January to July of 2018, and all of it
was written in Mexican Spanish [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. There are a total of 971 news items in the
corpus.
      </p>
      <p>
        We used several different neural network models for comparison, such as
a convolutional neural network for text classification (TextCNN) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the fast text classifier (fastText) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
and A Lite BERT for self-supervised learning of language representations
(ALBERT) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For the given data set in the task, we found that ALBERT
performed best on our validation set. Therefore, we accomplished this task by
using the ALBERT model.
      </p>
      <p>The rest of this paper is organized as follows. Section 2 briefly introduces
related work. Section 3 introduces our method in detail, including the
description of the data set, data preprocessing and our architecture. Section 4 outlines the
evaluation process. Finally, Section 5 summarizes our work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        IberLEF is an Iberian-language evaluation forum for NLP tasks. In the 2020
edition of the fake news detection task, participants proposed a variety of methods,
from traditional machine learning to deep learning, such as BoW, n-grams,
neural networks, Transformers, etc. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ][
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. According to the organizers' analysis,
the best results were obtained using the Supervised Autoencoder (SAE) method,
which is a neural network that learns a representation (encoding) of the
input data and then learns to reconstruct the original input [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. They used three
different types of features as input representations: word n-grams, character n-grams
and BETO encodings [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In the previous edition, the supervised
autoencoder method also achieved good results [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Detecting fake news on social media is becoming more and more
important. To build an effective classifier, one of the most important problems is to
find suitable input features. Generally, two types of features are
widely used: one is surface features, such as n-grams [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and the other is
word representations trained by a neural network, such as skip-grams. General
classifiers use traditional machine learning methods, such as support vector
machines, random forests, logistic regression, etc., to train for different types of
tasks. In many NLP tasks, it is effective to use pre-trained word embeddings
to extract features [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ][
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. A word embedding model is extracted from a
shallow neural network that is trained on a large amount of text data, and it can
learn the contextual representation of words; examples include skip-grams and GloVe [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. But these word embeddings are
learned from all possible words, which may obscure
semantic nuances. In contrast, transformer-based language models, such
as the OpenAI Generative Pre-trained Transformer (GPT) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and BERT [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] have
been extended to a depth of as much as 12 layers. ALBERT uses techniques such
as parameter sharing and matrix decomposition to greatly reduce model
parameters [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. ALBERT can greatly raise the level of language models. It can learn
good feature representations for words by running an unsupervised language
representation learning algorithm on a massive corpus. So-called
self-supervised learning means that learning runs on unlabeled data,
without human supervision. Compared with ELMo and GPT, the pre-trained ALBERT
model has achieved good results in a series of NLP tasks [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <sec id="sec-3-1">
        <title>Datasets</title>
        <p>
          The Spanish news corpus was collected from January to July 2018, all written in
Mexican Spanish. The news was aggregated from several online sources: established
newspaper websites, media company websites, websites that specialize in
verifying fake news, and websites identified by different journalists as regularly publishing fake
news. There are 971 news items in the aggregated corpus. The
news covers nine different topics, making the corpus as balanced as
possible. The numbers of fake and real news items are also roughly balanced. In the
data set, 676 items are used as the training set and 295 items are
used as the validation set. The ratio of the training set to the validation
set is about 7:3 [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Pre-processing</title>
        <p>
          Although deep learning methods can learn the main features from the data, the
output performance of the model also depends on the quality of the
input training data [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Data preprocessing
can remove noise from the input data to improve the performance of the
model. For the model we used, we performed the following basic preprocessing
steps (a short sketch follows the list):
1. Convert the input text to lowercase.
2. Remove punctuation marks.
3. Delete numeric characters.
4. Delete the stop-words.
        </p>
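        <p>
          A minimal sketch of these four steps (an assumed implementation; the paper only fixes NLTK for the stop-word removal) might look as follows:
        </p>
        <preformat>
# Sketch of the four preprocessing steps (assumed implementation).
import re
import string

import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
SPANISH_STOPWORDS = set(stopwords.words("spanish"))

def preprocess(text):
    text = text.lower()                              # 1. lowercase
    text = text.translate(
        str.maketrans("", "", string.punctuation))   # 2. remove punctuation
    text = re.sub(r"\d+", "", text)                  # 3. remove numeric characters
    tokens = [t for t in text.split()
              if t not in SPANISH_STOPWORDS]         # 4. remove stop-words
    return " ".join(tokens)

print(preprocess("El 45% de las noticias falsas, de acuerdo al estudio."))
        </preformat>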
        <p>
          We removed information that was not useful for the model to extract features.
We used the Natural Language Toolkit (NLTK) to complete the stop-word
removal step. In the experiments, we use 5-fold cross-validation and control the order
of the data in each batch. For each fold of the data set, the input data format is
[CLS] + sentence + [SEP] (the [CLS] token marks the start of each sample, and [SEP]
separates the different sentences within a sample). The pre-trained
model is loaded from the ALBERT-base-V2 model [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. In the V2 version, ALBERT
applies the 'no dropout', 'additional training data' and 'long training time' strategies
to all models. ALBERT-base is trained for 10M steps and the other models for 3M
steps.
        </p>
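        <p>
          As an illustration, the following minimal sketch (assumed tooling; the paper only names the pre-trained model) builds such [CLS] + sentence + [SEP] inputs with the albert-base-v2 tokenizer:
        </p>
        <preformat>
# Sketch: building [CLS] + sentence + [SEP] inputs (assumed tooling).
from transformers import AlbertTokenizer

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")

# The tokenizer adds [CLS] and [SEP] automatically and pads or truncates
# to max-seq-length, which we limited according to GPU memory.
encoded = tokenizer(
    "texto de la noticia ya preprocesado",
    max_length=128,
    padding="max_length",
    truncation=True,
)
print(encoded["input_ids"][:10])
        </preformat>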
      </sec>
      <sec id="sec-3-3">
        <title>ALBERT</title>
        <p>
          The research trend in the NLP field is to use larger and larger models to
obtain better performance, since the depth of the network can improve the results of
the model [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ][
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Research on ALBERT shows that blindly stacking model
parameters may reduce performance, and memory consumption and training speed will also
be hindered. ALBERT solves this problem by designing a Lite BERT
architecture, which has fewer parameters than the traditional BERT architecture [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
ALBERT is the "A Lite" version of BERT, a popular unsupervised language
representation learning algorithm. ALBERT uses parameter-reduction techniques
that allow for large-scale configurations, overcome previous memory limitations,
and achieve better behavior with respect to model degradation. ALBERT uses
parameter sharing, matrix decomposition and other techniques to greatly
reduce the model parameters, and at the same time replaces the NSP (Next Sentence
Prediction) loss with the SOP (Sentence Order Prediction) loss to improve the
performance on downstream tasks. The reduction of parameters makes
training faster [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The structure of ALBERT is basically the same as that of BERT, with
three specific improvements: embedding-layer parameter
factorization, cross-layer parameter sharing, and replacing the NSP task with the SOP task. We use
ALBERT to fine-tune on the training data set.
        </p>
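        <p>
          To make the embedding factorization concrete, the following back-of-the-envelope computation (sizes taken from the albert-base-v2 configuration; an illustration, not part of our system) shows the parameter reduction in the embedding layer alone:
        </p>
        <preformat>
# ALBERT factorizes the V x H embedding table into V x E and E x H matrices.
V, H, E = 30000, 768, 128     # vocab size, hidden size, embedding size

bert_style = V * H            # about 23.0M embedding parameters
albert_style = V * E + E * H  # about 3.9M embedding parameters
print(bert_style, albert_style)
        </preformat>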
      </sec>
      <sec id="sec-3-3">
        <title>Method</title>
        <p>
          In the classification task, the output of ALBERT-Base (the pooler output) is obtained
from the last-layer hidden state of the first token of the sequence (the CLS token),
further processed by a linear layer and a Tanh activation function. However, the
pooler output cannot summarize the input semantic content well, and studies
have shown that the top layers of BERT learn richer semantic
features [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. We therefore modify the ALBERT model to obtain richer semantic
features. We pass the Spanish news directly to the ALBERT model
and concatenate H0 (the hidden state of the first token of the sequence,
the CLS token, at the output of each hidden layer) of the last three
hidden layers and feed the result into the classifier. We call this method the
ALBERT Classifier; a sketch is given below.
        </p>
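        <p>
          A minimal TensorFlow sketch of this head (our assumed implementation using the transformers library; the sequence length, label count and learning rate are illustrative) is:
        </p>
        <preformat>
# Sketch: concatenate H0 ([CLS]) of the last three ALBERT layers, then classify.
import tensorflow as tf
from transformers import TFAlbertModel

albert = TFAlbertModel.from_pretrained("albert-base-v2",
                                        output_hidden_states=True)

input_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(128,), dtype=tf.int32,
                                name="attention_mask")

outputs = albert(input_ids, attention_mask=attention_mask)
# hidden_states holds the embedding output plus one tensor per layer;
# take H0 (position 0, the [CLS] token) of the last three layers.
cls_states = [outputs.hidden_states[i][:, 0, :] for i in (-3, -2, -1)]
features = tf.keras.layers.Concatenate()(cls_states)
probs = tf.keras.layers.Dense(2, activation="softmax")(features)

model = tf.keras.Model([input_ids, attention_mask], probs)
model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
        </preformat>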
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>Task 7 is to detect fake news, and the fake news detection solutions are ranked
by the F1 measure on the "Fake" class.</p>
      <p>
        In our work, the implementation of all models is based on TensorFlow, and
the pre-trained models are cased. Due to the limitation of personal GPU memory,
the batch size and max-seq-length in the fine-tuning stage were adjusted
according to the memory capacity in order to achieve the best results. The optimizer
used in this experiment is Adam [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Table 1 shows the
hyperparameters of each model on the validation data set of the fake news detection task
and the results of each model on the validation data set.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>
        In this paper, we introduce our method for participating in the shared task on
Spanish fake news detection organized at IberLEF 2021. We propose to
modify the upper structure of the ALBERT model and use the ALBERT-Base-V2
pre-trained model for training. The experiments use 5-fold cross-validation.
Finally, we obtain the final result through hard voting; a sketch of this step is
shown below. In future work, we hope to explore more effective data preprocessing
methods and use data augmentation to make the model perform better, and to
improve our results in the next IberLEF competition.
      </p>
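      <p>
        The hard-voting step can be sketched as follows (illustrative code; the array shapes and the 0/1 label convention are assumptions):
      </p>
      <preformat>
# Hard voting over the predictions of the five fold models (sketch).
import numpy as np

def hard_vote(fold_predictions):
    # fold_predictions: shape (n_folds, n_samples), entries are 0/1 labels.
    votes = np.asarray(fold_predictions)
    # A sample gets label 1 when the majority of folds predict 1.
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)

preds = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1], [1, 0, 0]]
print(hard_vote(preds))  # majority label per news item -> [1 0 1]
      </preformat>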
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aragon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jarquín</surname>
          </string-name>
          , H.,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Escalante</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villaseñor-Pineda</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bel-Enguix</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Posadas-Duran</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Overview of MEX-A3T at IberLEF 2020: Fake news and aggressiveness analysis in Mexican Spanish</article-title>
          .
          <source>In: Notebook Papers of 2nd SEPLN Workshop on Iberian Languages Evaluation Forum (IberLEF)</source>
          , Malaga, Spain (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gomez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Posadas-Duran</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bel-Enguix</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Porto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Overview of FakeDeS task at IberLEF 2020: Fake news detection in Spanish</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          (
          <issue>0</issue>
          ) (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>arXiv preprint arXiv:1607.01759</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kingma</surname>
            ,
            <given-names>D.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ba</surname>
          </string-name>
          , J.:
          <article-title>Adam: A method for stochastic optimization</article-title>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kotsiantis</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanellopoulos</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pintelas</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          :
          <article-title>Data preprocessing for supervised leaning</article-title>
          .
          <source>International Journal of Computer Science</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ),
          <fpage>111</fpage>
          -
          <lpage>117</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gimpel</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soricut</surname>
            ,
            <given-names>R.:</given-names>
          </string-name>
          <article-title>ALBERT: A lite BERT for self-supervised learning of language representations</article-title>
          .
          <source>arXiv preprint arXiv:1909.11942</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demirel</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
          </string-name>
          , Y.:
          <article-title>N-gram graph: Simple unsupervised representation for graphs, with applications to molecules</article-title>
          .
          <source>arXiv preprint arXiv:1806.09206</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.:
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</source>
          . pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Posadas-Duran</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sidorov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Escobar</surname>
            ,
            <given-names>J.J.M.:</given-names>
          </string-name>
          <article-title>Detection of fake news in a new corpus for the Spanish language</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          <volume>36</volume>
          (
          <issue>5</issue>
          ),
          <fpage>4869</fpage>
          -
          <lpage>4876</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amodei</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Language models are unsupervised multitask learners</article-title>
          .
          <source>OpenAI blog 1(8)</source>
          ,
          <volume>9</volume>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Rakhlin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>GitHub</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Vaswani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shazeer</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uszkoreit</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polosukhin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Attention is all you need</article-title>
          .
          <source>arXiv preprint arXiv:1706.03762</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Villatoro-Tello</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramírez-de-la-Rosa</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parida</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motlicek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Idiap and UAM participation at MEX-A3T evaluation campaign</article-title>
          .
          <source>In: Notebook Papers of 2nd SEPLN Workshop on Iberian Languages Evaluation Forum (IberLEF)</source>
          , Malaga, Spain (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Using a stacked residual LSTM model for sentiment intensity prediction</article-title>
          .
          <source>Neurocomputing</source>
          <volume>322</volume>
          ,
          <fpage>93</fpage>
          -
          <lpage>101</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Community-based weighted graph model for valence-arousal prediction of affective words</article-title>
          .
          <source>IEEE/ACM Transactions on Audio, Speech, and Language Processing</source>
          <volume>24</volume>
          (
          <issue>11</issue>
          ),
          <fpage>1957</fpage>
          -
          <lpage>1968</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>