<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IRUGCN: A Graph Convolutional Network Rumor Detection Model Incorporating User Behavior</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shu Zhou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hao Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhengda Zhou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haohan Yi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bin Shi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Nanjing University</institution>
          ,
          <addr-line>163 Xianlin Road, Nanjing, Jiangsu, 210023</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces a novel rumor detection model for social media that enhances identification accuracy by incorporating user behavior alongside traditional user features. Utilizing graph convolutional networks for user representation, a recurrent neural network for analyzing propagation tree structures, and an integrator for merging these analyses, the model adeptly captures both user behaviors and the dynamics of rumor spread. Tested on Twitter15 and Twitter16 datasets, it achieved superior accuracy rates of 85.2% and 87.3%, respectively, outperforming existing models. Although the model currently does not differentiate interaction stances between users through weighted graph edges, its integration of user behavior marks a significant advancement in precise rumor detection.</p>
      </abstract>
      <kwd-group>
        <kwd>User Behaviour</kwd>
        <kwd>Graph Convolutional Network</kwd>
        <kwd>Rumor Detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The proliferation of rumors on social media,
highlighted by their significant impact during events
such as the 2016 U.S. presidential election [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
makes their identification and containment difficult,
because verification requires extensive
human resources and is prone to inaccuracies.
To address this, current research largely focuses on
deep learning-based rumor detection that emphasizes
content and user attributes, yet it often overlooks
user behavior patterns [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This
study introduces the IRUGCN model, which uses graph
convolutional networks to analyze user behavior
alongside traditional user features for more effective rumor
detection. The model comprises a user encoder, a
propagation tree encoder, and an integrator, and it
combines user behavior analysis with content and
propagation dynamics. IRUGCN demonstrates superior
performance on the Twitter15 and Twitter16
benchmark datasets,
offering a promising approach to mitigating the
spread of misinformation on social media.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <sec id="sec-2-1">
        <title>2.1. Task definition</title>
        <p>The goal is to improve rumor detection on social
media. The dataset is composed of tuples representing
statements (tweets) and their associated users,
arranged to form propagation trees and a user
co-occurrence graph. The detection model classifies
each tuple into one of four categories: non-rumor,
false rumor, true rumor, or unconfirmed rumor.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Overall structure</title>
        <sec id="sec-2-2-1">
          <title>2.2.1. User Encoder</title>
          <p>The user encoder uses graph convolutional networks
(GCN) to encode user behavior and static characteristics
into a higher-order user representation. Its input is an
undirected graph in which users are linked according to
their interactions, with the adjacency matrix adjusted to
reflect the significance of these interactions. The user
features are represented as a matrix, and each GCN layer
updates the node feature matrix by integrating
information from neighboring nodes.</p>
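          <p>As a hedged illustration of this neighbor-aggregation step, the following framework-free Python sketch applies one graph convolution in the widely used symmetric-normalized form H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). The exact adjacency adjustment used by IRUGCN is not given in this text, so the normalization, the toy graph, and the names gcn_layer, adjacency, features, and weights are assumptions made for illustration only.</p>
          <preformat><![CDATA[
```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    Assumed form (standard symmetric normalization), not necessarily
    the exact adjacency adjustment used by IRUGCN.
    """
    n = len(adjacency)
    # Add self-loops so each user keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Inverse square root of each node's degree in the self-looped graph.
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric normalization of the adjacency matrix.
    a_norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
              for i in range(n)]
    # Aggregate neighbor features, project them, then apply ReLU.
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Toy undirected user graph: user 0 interacts with users 1 and 2.
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical user features
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity projection for readability
user_repr = gcn_layer(adjacency, features, weights)
```
]]></preformat>
          <p>Each row of user_repr mixes a user's own features with those of its neighbors; stacking several such layers would propagate information over longer interaction paths.</p>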
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Propagation tree structure encoder</title>
          <p>The propagation tree encoder employs bottom-up and
top-down recurrent neural network encoders to capture
the structural and semantic features of rumor
propagation trees. The bottom-up approach aggregates
child node representations to compute a parent node's
representation, capturing long-distance interaction
dependencies. Conversely, the top-down approach
computes each node's representation from the node's
own features and its parent node's representation. In
both cases, the aim is to encode the propagation tree's
structure and semantics into a vector representation.</p>
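          <p>The two traversal orders can be sketched as follows. This is a minimal illustration under strong assumptions: node features are scalars rather than text vectors, the recurrent unit is a plain tanh cell with fixed scalar weights w_x and w_h instead of a learned gated unit, and the top-down variant max-pools the leaf states. The names bottom_up, top_down, tree, and feats are hypothetical and do not come from the paper.</p>
          <preformat><![CDATA[
```python
import math


def bottom_up(tree, feats, node, w_x=0.5, w_h=0.5):
    """Encode a subtree: children are encoded first, then summed into the parent."""
    child_sum = sum(bottom_up(tree, feats, c, w_x, w_h) for c in tree.get(node, []))
    return math.tanh(w_x * feats[node] + w_h * child_sum)


def top_down(tree, feats, root, w_x=0.5, w_h=0.5):
    """Encode every node from its own feature and its parent's state."""
    states = {}
    stack = [(root, 0.0)]  # (node, parent state); the root has no parent
    while stack:
        node, parent_state = stack.pop()
        states[node] = math.tanh(w_x * feats[node] + w_h * parent_state)
        stack.extend((child, states[node]) for child in tree.get(node, []))
    leaves = [n for n in states if not tree.get(n)]
    # Max-pool the leaf states into a single tree representation.
    return max(states[n] for n in leaves)


# Hypothetical propagation tree: source tweet 0 with replies 1 and 2; 3 replies to 1.
tree = {0: [1, 2], 1: [3]}
feats = {0: 1.0, 1: -0.5, 2: 0.2, 3: 0.8}  # scalar stand-ins for text vectors
bu_repr = bottom_up(tree, feats, 0)
td_repr = top_down(tree, feats, 0)
```
]]></preformat>
          <p>In the bottom-up pass the root state summarizes the whole tree, whereas in the top-down pass every leaf state has already seen the source tweet, so the tree representation is pooled from the leaves.</p>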
          <p>Integrator: The integrator combines the outputs of
the user encoder and the propagation tree encoder. It
fuses the user representation with the propagation tree
representation through a fully connected layer and
predicts the category of each statement from both user
and propagation tree information.</p>
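          <p>A minimal sketch of such a fusion step is given below, assuming that the two representations are concatenated before the fully connected layer and that a softmax turns the layer output into class probabilities. The function name integrate, the vector sizes, and all numeric values are hypothetical.</p>
          <preformat><![CDATA[
```python
import math


def integrate(user_repr, tree_repr, weights, bias):
    """Concatenate both representations and map them to class probabilities."""
    fused = user_repr + tree_repr  # list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    # Numerically stable softmax over the four rumor classes.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["non-rumor", "false rumor", "true rumor", "unconfirmed rumor"]
user_repr = [0.2, -0.1]   # hypothetical user encoder output
tree_repr = [0.7, 0.3]    # hypothetical propagation tree encoder output
weights = [               # one row of hypothetical weights per class
    [0.1, 0.0, 0.2, 0.0],
    [0.0, 0.3, 1.0, 0.5],
    [0.2, 0.1, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
bias = [0.0, 0.0, 0.0, 0.0]
probs = integrate(user_repr, tree_repr, weights, bias)
predicted = CLASSES[probs.index(max(probs))]
```
]]></preformat>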
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Model training</title>
          <p>Training minimizes the cross-entropy loss
between the model's predicted probability
distribution and the true labels, with an added
regularization term to prevent
overfitting.</p>
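          <p>The objective can be illustrated with the short sketch below. The text does not specify the form of the regularization term, so an L2 penalty with a hypothetical coefficient l2_weight is assumed here; the function name and the toy values are likewise illustrative.</p>
          <preformat><![CDATA[
```python
import math


def regularized_cross_entropy(probs_batch, labels, params, l2_weight=1e-4):
    """Mean cross-entropy over a batch plus an assumed L2 penalty on the parameters."""
    ce = -sum(math.log(probs[y]) for probs, y in zip(probs_batch, labels)) / len(labels)
    l2 = sum(p * p for p in params)
    return ce + l2_weight * l2


# Two hypothetical predictions over four classes, with true class indices.
probs_batch = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]
labels = [0, 2]
params = [0.5, -0.5, 1.0]  # flattened stand-in for the model weights
loss = regularized_cross_entropy(probs_batch, labels, params)
```
]]></preformat>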
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experimentation and analysis</title>
      <sec id="sec-3-1">
        <title>3.1. Experimental setup</title>
        <sec id="sec-3-1-1">
          <title>3.1.1. Data sets and evaluation indicators</title>
          <p>The experiments use two Twitter datasets,
Twitter15 and Twitter16, comprising 1381
and 1181 propagation trees respectively and
involving hundreds of thousands of users. The
data are categorized into non-rumor, false rumor,
true rumor, and unconfirmed rumor, and split in a
9:0.5:0.5 ratio into training, validation, and test sets.
Model performance is evaluated by overall
accuracy and class-specific F<sub>1</sub> scores.</p>
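          <p>For reference, the two evaluation measures can be computed as in the following sketch, which treats each class one-vs-rest when computing its F1 score. The label abbreviations and the toy predictions are hypothetical and are not results from the paper.</p>
          <preformat><![CDATA[
```python
def accuracy(y_true, y_pred):
    """Fraction of propagation trees whose predicted class is correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred, classes):
    """F1 score computed separately for each class (one-vs-rest)."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical labels: N = non-rumor, F = false, T = true, U = unconfirmed.
y_true = ["N", "F", "T", "U", "F", "N"]
y_pred = ["N", "F", "F", "U", "F", "T"]
acc = accuracy(y_true, y_pred)
f1 = f1_per_class(y_true, y_pred, ["N", "F", "T", "U"])
```
]]></preformat>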
          <p>The model is implemented in PyTorch using
recurrent neural networks and graph convolutional
networks, and is evaluated with 5-fold
cross-validation. Key configuration parameters include
256-dimensional word vectors; 256-dimensional
hidden layers for user statistical features, behavioral
information, and the integrator module; a batch size
of 256; and a learning rate of 0.005 with the Adam
optimizer. Experiments run on Python 3.7 on a
system with an NVIDIA GeForce RTX 2080 GPU. User
age data are preprocessed to remove
unrealistic values.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Experimental results and analysis</title>
        <sec id="sec-3-2-1">
          <title>3.2.1. Comparative analysis of methods</title>
          <p>The experimental analysis validates the IRUGCN
model's performance in rumor detection on the Twitter
datasets, comparing it with existing methods such as
BERT [<xref ref-type="bibr" rid="ref3">3</xref>], Transformer [<xref ref-type="bibr" rid="ref4">4</xref>], RvNN [<xref ref-type="bibr" rid="ref5">5</xref>], UMLARD [<xref ref-type="bibr" rid="ref6">6</xref>], and
DDGCN [7]. IRUGCN, which incorporates both user
behavior and structural propagation features,
significantly outperforms these methods, with the
top-down encoder variant showing the best results.</p>
          <p>This highlights the importance of integrating user
behavioral data for accurate rumor identification.</p>
          <p>Comparative tests reveal that direct use of user
statistical features or fully connected layers for user
behavior analysis is less effective than the proposed
user encoder, especially on larger datasets,
underscoring the encoder's efficiency in capturing
complex user interactions.</p>
          <p>From the results in Table 2, we can see that our
model outperforms the other models on both datasets,
and that TD-IRUGCN is superior to BU-IRUGCN.</p>
          <p>Compared with the DDGCN and UMLARD methods,
the model in this paper considers not only the
statistical characteristics of users but also their
behavioral information. Since the group behavior of
users is more likely to spread rumors, the accuracy of
rumor detection can be significantly improved when
user behavioral information is taken into account,
although it may also introduce other interference into
rumor detection. In contrast, BERT and
Transformer have relatively low accuracy because
they do not integrate many user features or other
propagation features.</p>
        </sec>
        <sec id="sec-3-2-1b">
          <title>3.2.2. Encoder Effectiveness Analysis</title>
          <p>Further analysis of the user encoder's effectiveness
highlighted its superiority over benchmark methods
that rely solely on users' statistical features or use
fully connected layers for feature integration. The
study revealed that the user encoder performs
exceptionally well on larger datasets, benefiting from
rich user behavior and interaction patterns.</p>
          <p>Table 2. Performance of the bottom-up propagation tree encoder combining different features on the Twitter15
dataset (columns: non-rumor, false rumor, true rumor).</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.3. Analysis of ablation experiments</title>
          <p>A series of ablation experiments was conducted to
isolate the contribution of each model component.
These experiments underscored the significance of
user behavioral features and revealed an
advantage of the top-down propagation tree structure
encoder over the bottom-up approach, which is attributed
to its early incorporation of global information.</p>
          <p>An early rumor detection analysis further confirmed
IRUGCN's effectiveness: the model
outperformed its counterparts at critical early detection
time points, showing its potential to curb rumor
spread proactively.</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.5. Sample Tree Cases of False Rumor Spreading</title>
          <p>The study also includes a qualitative analysis of false
rumor propagation patterns through sample tree
cases, illustrating the dynamic interplay of support,
rebuttal, and skepticism in rumor spread. This
underlines the model's ability to capture complex
propagation dynamics, contrasting with traditional
models' limitations in grasping the depth of user
interactions and responses.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Summary</title>
      <p>This paper presents the development and validation
of a graph neural network-based rumor
detection model that enhances accuracy
and real-time performance by incorporating user
behavior. The model integrates three
components: a user encoder, a propagation tree
structure encoder, and an integrator, which together
support a multi-dimensional analysis of rumors
covering content, users, and propagation.</p>
      <p>Empirical tests on real-world datasets have
underscored the model's superiority in rumor
detection accuracy over existing methodologies,
laying a robust groundwork for future explorations.
Moving forward, the research will delve into a more
nuanced examination of inter-user interactions, such
as assigning weights to user graph edges based on
user interaction stances, to refine the model's
sensitivity towards intricate user relationships.
Additionally, the potential of transforming datasets
into graph structures is being considered to further
elevate the model's performance and versatility.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This paper is supported by the National Natural
Science Foundation of China under contract No.
72074108, Special Project of Nanjing University
Liberal Arts Youth Interdisciplinary Team
(010814370113), Jiangsu Young Social Science
Talents, and Tang Scholar of Nanjing University.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Rahim</surname>
            <given-names>A.</given-names>
          </string-name>
          <article-title>Rumor Identification on Twitter Data for 2020 US Presidential Elections with BERT Model[J]</article-title>
          .
          <source>UMT Artificial Intelligence Review</source>
          ,
          <year>2021</year>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Gumaei</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Rakhami</surname>
            <given-names>M S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassan</surname>
            <given-names>M M</given-names>
          </string-name>
          , et al.
          <article-title>An Effective Approach for Rumor Detection of Arabic Tweets Using eXtreme Gradient Boosting Method[J]</article-title>
          .
          <source>ACM Transactions on Asian and Low-Resource Language Information Processing</source>
          ,
          <year>2022</year>
          ,
          <volume>21</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Devlin</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            <given-names>M W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>K</given-names>
          </string-name>
          , et al.
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[M]</article-title>
          . arXiv,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Vaswani</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shazeer</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmar</surname>
            <given-names>N</given-names>
          </string-name>
          , et al.
          <article-title>Attention Is All You Need[M]</article-title>
          . arXiv,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ma</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            <given-names>K F</given-names>
          </string-name>
          .
          <article-title>Rumor Detection on Twitter with Tree-structured Recursive Neural Networks[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</article-title>
          . Melbourne, Australia: Association for Computational Linguistics,
          <year>2018</year>
          :
          <fpage>1980</fpage>
          -
          <lpage>1989</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Chen</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trajcevski</surname>
            <given-names>G</given-names>
          </string-name>
          , et al.
          <article-title>Multi-view learning with distinguishable feature fusion for rumor detection</article-title>
          [J].
          <source>Knowledge-Based Systems</source>
          ,
          <year>2022</year>
          ,
          <volume>240</volume>
          :
          <fpage>108085</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>