=Paper=
{{Paper
|id=Vol-3745/paper5
|storemode=property
|title=Unveiling the Secret of Information Rediffusion Process on Social Media from Information Coupling Perspective: a Hybrid Approach of Machine Learning and Regression Model
|pdfUrl=https://ceur-ws.org/Vol-3745/paper5.pdf
|volume=Vol-3745
|authors=Zhen Yan,Rong Du,Hua Wang
|dblpUrl=https://dblp.org/rec/conf/eeke/YanDW24
}}
==Unveiling the Secret of Information Rediffusion Process on Social Media from Information Coupling Perspective: a Hybrid Approach of Machine Learning and Regression Model==
Zhen Yan<sup>1,∗</sup>, Rong Du<sup>2</sup> and Hua Wang<sup>1,∗</sup>

<sup>1</sup> Xi’an Jiaotong University, Shaanxi 710049 Xi’an, China

<sup>2</sup> Xidian University, Shaanxi 710126 Xi’an, China

Joint Workshop of the 5th Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2024) and the 4th AI + Informetrics (AII2024), April 23-24, 2024, Changchun, Jilin, China and Online.

∗ Corresponding author. jessieyan92@163.com (Z. Yan); durong@mail.xidian.edu.cn (R. Du); seablue@xjtu.edu.cn (H. Wang)

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

===Abstract===
Given the popularity and prevalence of communication through social media platforms, it is critical to determine the mechanisms that diffuse and rediffuse information. Prior studies have examined the impacts of a range of news item characteristics on the spread of information. However, little research has yet explored the influence that information coupling might have on the commenting and reposting behavior of users. Using the Sina Microblog site, we modeled three information couplings – emotional coupling, semantic coupling, and cognitive coupling – to determine whether they have any influence on the spread of information. We also examined whether opinion leaders wield a moderating influence in these relationships. Building on the core literature and theories, we find that emotional and semantic coupling contribute more to commenting, whereas cognitive and emotional coupling both influence reposting more. Both these findings are supported by construal-level theory. Opinion leaders have a positive correlation with reposting, which is also supported by two-step flow theory. Overall, this research deepens our present understanding of information rediffusion at the comment and reposting levels. Our findings highlight the importance of considering information coupling from a linguistic point of view and of considering the influence of opinion leaders. This research also opens up interesting opportunities for further study on the role that information coupling might play given a comprehensive view of user-generated content (UGC). The outcomes of this study should help social media platforms and their users better understand how information spreads on social media.

Keywords: information coupling, two-fixed model, construal-level theory, two-step flow theory, information rediffusion

===1. Introduction===
In the post-internet era, communicating through social media has become a ubiquitous part of daily life. This not only gives rise to massive amounts of information [...] more sensitive to public health information, they have also become more likely to get information about public health emergencies from social media (Becker & Gijsenberg, 2022). This is because they believe that information sharing and communicating with others will provide them with more up-to-date and transparent information more quickly (Wang et al., 2022). The Sina Microblog, one of the world’s biggest social media platforms, was an important and popular form of human-media interaction during the pandemic and has continued to be so ever since. There is no doubt that social technologies and constantly evolving internet technologies are transforming information diffusion, rediffusion, and the way people acquire information and knowledge. It is therefore paramount to explore the factors that influence these rediffusion processes and the mechanisms by which the coupling of information content and context influences the process.

Some scholars have studied information diffusion processes from the perspective of user behavior, such as information sharing (Fu & Shen, 2014), reactions to information (Kim et al., 2023), and interactions with information (Jensen et al., 2013), while others have studied the content of information, including the emotions conveyed (Naskar et al., 2020) and the topics discussed (Chen et al., 2020; Kim et al., 2023). According to Chen et al. (2020), two main online behaviors influence information diffusion through social networks: commenting and reposting. Commenting provides platforms and sources of information rediffusion, while reposting facilitates information rediffusion because of the structure of the Internet.

Information coupling is described by Bhowmick et al. (1998) as an association of topically related documents for managing and manipulating coupled information extracted from a database. In the present study, it refers to the degree of difference between an information source and the user-generated content (UGC), i.e. the content that is created by members of the general public and distributed over the internet (Daugherty et al., 2008; Krumm et al., 2008). Information coupling has also been studied from the content-congruence and topic-consistency aspects, respectively (Peng et al., 2020; Kim et al., 2023). However, very little is known about how information coupling, which strongly arouses and promotes information rediffusion, influences the rediffusion process. To fill this research gap, this study concentrates on the factors that influence the information rediffusion process from the perspective of information coupling, i.e. the difference between the information source (hereafter, the news) and the UGC. There are three main research questions we seek to answer:

Research Question 1: How does information coupling influence information rediffusion in terms of commenting?

Research Question 2: How does information coupling influence information rediffusion in terms of reposting?

Research Question 3: How do opinion leaders affect information rediffusion?

To answer these research questions, we designed a moderated nonlinear model as a way of exploring which factors influence the information rediffusion process and how. The empirical setting for this study is news of public health emergencies and the UGC associated with this news, crawled from the Sina Microblog. The differences between the two types of information – news and UGC – form the information coupling. Our research examines information coupling from semantic, typology, and cognition perspectives, and employs a two-way fixed moderated nonlinear model (i.e., a comment-fixed effect model and a repost-fixed effect model).

===2. Theoretical background and conceptual model===

====2.1 Summarization of theoretical background====
Overall, prior studies have extensively studied the paradigm of networks and the motivations behind UGC and user behavior in the information diffusion process. Some scholars have developed algorithms based on information propagation theory, such as the SIR model (Xu et al., 2020; Harrigan et al., 2021), while others have used technical means to reveal any emotional influences at play (Singh et al., 2020; Chen et al., 2020; Diwali et al., 2023). However, information couplings comprising the origin of information and the UGC have received less attention, as has the contribution such couplings make to the information diffusion process. Our review indicates that specific user activities, along with the content of the information to be spread, have the greatest influence over whether the information will be disseminated.

====2.2 Conceptual model of the present work====
Drawing insights from the previous literature, the impact of information rediffusion is reflected in the total sum of comments and reposts. Given the structure of social networks, more comments should attract greater user attention, while more reposts should expand the sphere of exposure. In other words, reposts spread attention wider and further, while comments increase the level of scrutiny given to some news (Shiau et al., 2017).

In addition, the information rediffusion mechanism is also stimulated by information coupling. Emotions and topics, the most significant aspects of information content, reveal personal attitudes (Qiao et al., 2022; Yin et al., 2023). As mentioned, emotional couplings refer to the similarity of the feelings in an information source and its associated UGC. Here, extreme UGC is usually associated with intense emotions, and therefore may contain incoherent arguments (Yin et al., 2023). Indeed, to express a strong case for or against an information source, an incentivized user needs to deliver a particularly coherent argument that covers many details, thus giving rise to semantic meaning. For this reason, we assume that both emotional and semantic coupling influence information rediffusion. Further, due to individual differences in cognition, the cognitive influence of some news also plays an important role in delivering information. Metaphor, as the surface expression of cognition, is regarded as cognitive coupling, which is also one of the independent variables in this study.

However, the structure of social networks means that information diffusion will also depend on the relationships between users. These relationships directly influence information diffusion, but opinion leaders, who have large numbers of followers, also indirectly influence the information rediffusion process. Therefore, opinion leaders, as one important facet of social networks, are treated as a moderating variable in this study. The control variables include gender, whether the user is verified, the number of posts the user has made on the platform, and the number of users a user is following (László et al., 2023; Lin et al., 2022; Liu et al., 2023). The full conceptual model for this study is illustrated in Fig. 1 in Appendix 1.

===3. Methodology===

====3.1 Overview of the research framework====
Our dataset, which comprises 4,017 pieces of news and 416,358 pieces of UGC, was crawled from Sina Microblog. The period of study is 1 Dec 2021 to 1 Jun 2022. All of the news relates to public health emergencies because this type of news is particularly interesting to the public (Li et al., 2020). We then removed UGC consisting of only a few words and retained 415,473 pieces of UGC (i.e., we removed repeated data and symbol-only data and applied Jieba word segmentation).

As discussed in the literature review, we drew the factors for study from the literature. We modelled emotional coupling, semantic coupling, and cognitive coupling using a machine learning approach, and used negative binomial regression models to measure the influence of these factors on information rediffusion. The influence of opinion leaders was modelled as a moderating effect (Wang et al., 2022). Finally, we summarize the working mechanism of information rediffusion and apply it to management practice. Details are given in Figure 2 in Appendix 1.

====3.2 Variables description and measurement====
We took comments and reposts as our dependent variables, while the independent variables are emotional coupling, semantic coupling, and cognitive coupling. The influence of opinion leaders was modelled as a moderating variable. Opinion leaders were defined as those with more than 10,000 followers and a Big V badge on the Sina Microblog. Table 1 in Appendix 1 shows the definitions, formulas and measurement metrics for each variable.

We devised two fixed models to estimate the two different dependent variables, i.e., a commenting model and a reposting model. All of the dependent variables were measured in terms of frequency. All the measurements of variables are illustrated in Appendix 1.

===4. Results and Findings===

====4.1 Comment model====
Table 2 presents the results of the main regressions used to test the effects of the three types of couplings on information rediffusion. Note that we standardized all continuous independent variables to facilitate the comparison of effect sizes. We first entered the control variables in Model 1 and then added the three coupling variables and the moderating variable to Models 2-5 in a stepwise fashion. We then compared the R<sup>2</sup> of Models 2-5 with Model 1, which was taken as the baseline model, and found that adding the three coupling variables along with the moderating variable significantly improved the model’s fit (p < 0.001).

Model 2, which includes all the control variables, tests the influence of emotional coupling (M = 1.084, SD = 0.557). The result shows that emotional coupling attracts more comments (β<sub>1</sub> = 1.007**), which implies that when the difference in emotional intensity between the UGC and the news increases by 1, the UGC receives one more comment. Thus, emotional intensity has a positive effect on information rediffusion at the comment level. Model 3, which tests semantic coupling, shows that this type of coupling is also positively related to information rediffusion at the comment level (β<sub>2</sub> = 0.667***, p < 0.001). This result indicates that greater semantic similarity between the news and the UGC will significantly increase the number of comments made against the item. Model 4, which tests the influence of cognitive coupling on comments, also indicates a positive correlation. Thus, the more cognitively similar the news and the UGC, the more comments the item will attract (β<sub>3</sub> = 0.637***, p < 0.001). Opinion leaders, as a moderating variable, also have a positive effect on comments (β<sub>4</sub> = 0.227*, p < 0.05).

====4.2 Repost model====
The results of the negative binomial model tests to assess how the variables influence reposting behavior are shown in Table 3. Model 6 contains the control variables and is regarded as the baseline of the reposting model. Compared to Model 1 in Table 2, Model 6 demonstrates that gender and whether the user is verified contribute more significantly to reposting than to comments (β<sub>5</sub> = 0.857**, p < 0.01).

Models 7-10 portray the stepwise regressions for the independent and moderating variables. In Model 7, emotional coupling is shown to have a positive influence on reposting (β<sub>1</sub> = 0.946**, p < 0.01), indicating that differences in emotional coupling attract more frequent reposts. Semantic coupling also significantly affects reposting, as indicated by Model 8 (β<sub>2</sub> = 0.417***, p < 0.001), while cognitive coupling also significantly influences reposting behavior, as demonstrated by the results from Model 9 (β<sub>3</sub> = 0.668***, p < 0.001). The moderating variable, opinion leaders, has a greater positive influence on reposting than it does on commenting (β<sub>4</sub> = 3.388**, p < 0.01), as shown by Model 10 (Table 3) when compared to Model 5 (Table 2). This phenomenon explicitly displays the “nudge” effect of opinion leaders in social networks, as two-step flow theory posits.

====4.3 Moderating factors====
In terms of the moderating effect of opinion leaders between information coupling and rediffusion, the data indicate that the interactions of opinion leaders with emotional coupling, semantic coupling, and cognitive coupling are significantly correlated with each other (see Models 11 and 12 in Table 4).

Models 11 and 12 also demonstrate that opinion leaders exert a different influence over commenting behavior than over reposting. Opinion leaders will attract a greater number of comments through emotional intensity (β<sub>1</sub> = 2.317***, p < 0.001) and by relying on cognitive expressions (β<sub>3</sub> = 2.304***, p < 0.001). However, to attract more reposts, opinion leaders need to motivate users through semantic content (β<sub>2</sub> = 2.359***, p < 0.001) and, again, cognitive expressions (β<sub>3</sub> = 2.707***, p < 0.001). Overall, similarity in metaphorical expression is the most important factor in an opinion leader receiving comments and reposts on social media.

===5. Conclusion and implication===
The overarching conclusions from this research are that emotional and semantic coupling prompt information rediffusion through comments, while reposting typically depends on emotional and cognitive coupling. Further, opinion leaders contribute more to reposting behavior than to commenting. Compared to previous studies, the specific contributions of this study can be summarized as follows.

Although previous studies on the diffusion of information report that content needs to be written in a certain way or placed in a certain context in order to be perceived easily by others, emerging evidence from B2C platforms suggests that the concreteness of lexical cues can influence the beliefs and mindsets of users as they read and make sense of UGC (Peng et al., 2020; Jörg et al., 2023). However, few of these studies have examined the cognitive cues underlying content at the lexical level. Building on and going beyond recent studies, we applied metaphorical expressions, the linguistic surface of cognition, to determine the effect of cognitive coupling.

In theory, Figure 3 in the appendix shows that the difference in emotional intensity between a piece of news and some UGC is a highly significant factor, as indicated by the green curve, which fluctuates dramatically. This is consistent with previous findings (Yin et al., 2023) and is supported by cognitive dissonance theory (Festinger, 1962). Cognitive dissonance refers to the psychological state of discomfort or stress triggered by factors such as contradictory information in the environment, or the inconsistency of one’s beliefs with one’s actions or with new information. Individuals find self-contradictory information difficult to process (Alter & Oppenheimer, 2009), which typically shows up as reduced attention. Fig. 3 portrays the sentiment polarity of the news (the blue curve), the UGC (the red curve), and their difference (the green curve). It shows that when the difference between news and UGC in emotional intensity fluctuates widely, the emotional intensity of the UGC changes widely as well. The sentiment polarity of the difference shows that contradictions directly contribute to an increase in cognitive dissonance in the evaluation of the same attributes across different information content. At the same time, polarized emotional intensity is always accompanied by fewer comments or reposts. Therefore, our results suggest that as the difference in emotional intensity becomes larger, as supported by cognitive dissonance theory, it negatively influences how UGC is perceived, as manifested by lower numbers of comments and reposts.

The interaction effects of opinion leaders with the three types of information coupling also represent a prominent cue that opinion leaders positively influence the number of comments mainly through expressing intense emotions, which can shape others’ thinking and mindsets. However, using different metaphorical expressions, especially converse metaphors, helps opinion leaders to attract more reposts. More specifically, spatial metaphors such as “up”, “increase”, and “support” are always paired with “down”, “doubt of the facts”, and “bottom” in the UGC of opinion leaders that receives more reposts. For example, the number of patients is always described as extremely high with less treatment, which portrays an opposite picture in public health emergencies and attracts more comments and reposts. Besides, the structural metaphor “the pandemic is a war” is used to map public health emergencies to war, so many expressions about war are used to describe the emergencies. The doctors and nurses are described as soldiers and heroes, which provides a more specific picture of the fierce situation in public health emergencies. Metaphors of this type used by opinion leaders attract more comments or reposts as well.

===References===
[1] Alter A. L., Oppenheimer D. M. (2009). Uniting the tribes of fluency to form a metacognitive nation. Personality Soc. Psych. Rev. 13(3), 219–235.

[2] Becker, M., & Gijsenberg, M. J. (2022). Consistency and Commonality in Advertising Content: Helping or Hurting? International Journal of Research in Marketing.

[3] Bhowmick Sourav, Ng Wee-Keong & Lim Ee Peng. (1998). Information coupling in web databases. Conceptual Modeling - ER ’98: 17th International Conference on Conceptual Modelling, Singapore, 16–19 November 1998: Proceedings, 1507, 92-106.

[4] Chen Yanzhen, Rui Huaxia & Whinston Andrew B. (2021). Tweet to the Top? Social Media Personal Branding and Career Outcomes. MIS Quarterly, 45(2): 499-534.

[5] Diwali, K. Saeedi, K. Dashtipour, M. Gogate, E. Cambria & A. Hussain. Sentiment Analysis Meets Explainable Artificial Intelligence: A Survey on Explainable Sentiment Analysis. IEEE Transactions on Affective Computing, doi: 10.1109/TAFFC.2023.3296373

[6] Festinger L. (1962). A Theory of Cognitive Dissonance (Stanford University Press, Stanford, California).

[7] Fu, Xin, & Shen, Yun (2014). Study of collective user behaviour in Twitter: A fuzzy approach. Neural Computing and Applications, 25(7), 1603–1614.

[8] Ge M. S., Mao R. & Cambria E. Explainable Metaphor Identification Inspired by Conceptual Metaphor Theory. 36th AAAI Conference on Artificial Intelligence. FEB 22-MAR 01, 2022, 10681-10689.

[9] Kim, E., Ding, M. (Annie), Wang, X. (Shane), & Lu, S. (2023). Does Topic Consistency Matter? A Study of Critic and User Reviews in the Movie Industry. Journal of Marketing, 87(3), 428–450. https://doi.org/10.1177/00222429221127927

[10] Li L., Wang Z., Zhang Q. & Wen H. (2020). Effect of anger, anxiety, and sadness on the propagation scale of social media posts after natural disasters. Information Processing & Management, 57(6), Article 102313.

[11] Li Jing, Shiqi Zhang, Wenting Ao. Why is instant messaging not instant? Understanding users’ negative use behavior of instant messaging software. Computers in Human Behavior, Volume 142, 2023, 107655.

[12] Liu Huiting, Chen Yi, Li Peipei, Zhao Peng & Wu Xindong. Enhancing review-based user representation on learned social graph for recommendation. Knowledge-Based Systems, Volume 266, 2023, 110438.

[13] Mao Rui, Lin Chenghua & Guerin F. (2018). Word Embedding and WordNet Based Metaphor Identification and Interpretation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1, 1222–1231, Melbourne, Australia. Association for Computational Linguistics.

[14] Peng Chih-Hung, Yin Dezhi & Zhang Han. (2020). More than Words in Medical Question-and-Answer Sites: A Content-Context Congruence Perspective. Information Systems Research. https://doi.org/10.1287/isre. [31].

[15] Qiao Dandan, Huaxia Rui (2022). Text Performance on the Vine Stage? The Effect of Incentive on Product Review Text Quality. Information Systems Research. https://doi.org/10.1287/isre.1146.

[16] Stieglitz Stefan & Linh Dang-Xuan. (2013). Emotions and Information Diffusion in Social Media: Sentiment of Microblogs and Sharing Behavior. Journal of Management Information Systems, 29(4): 217–247.

[17] Thaler, R. H., & Sunstein, C. R. (2009). Nudge: Improving decisions about health, wealth, and happiness. Penguin.

[18] Wang, X., Zhang, M., Fan, W., & Zhao, K. (2022). Understanding the spread of COVID-19 misinformation on social media: The effects of topics and a political leader's nudge. Journal of the Association for Information Science and Technology, 73(5), 726-737.

[19] Watts Duncan, J., & Dodds, Peter Sheridan (2007). Influentials, networks, and public opinion formation. Journal of Consumer Research, 34(4), 441–458.

[20] Wang, Ru, Rho, Seungmin, Bowei, Chen, & Wandong, Cai (2017). Modeling of large-scale social network services based on mechanisms of information diffusion: Sina Weibo as a case study. Future Generation Computer Systems, 74, 291–301.

[21] Xu, Jiuping, Tang, Weiyao, Zhang, Yi, & Wang, Fengjuan (2020). A dynamic dissemination model for recurring online public opinion. Nonlinear Dynamics, 99, 1269–1293.

[22] Yang Yi & Subramanyam Ramanath (2023). Extracting Actionable Insights from Text Data: A Stable Topic Model Approach. MIS Quarterly, 47(3), 923-954.

[23] Yin Dezhi, Triparna de Vreede, Logan M. Steele, Gert-Jan de Vreede (2022). Decide Now or Later: Making Sense of Incoherence Across Online Reviews. Information Systems Research. https://doi.org/10.1287/isre.2022.1150.

[24] Zhong Ning & David A. Schweidel (2020). Capturing Changes in Social Media Content: A Multiple Latent Changepoint Topic Model. Marketing Science, 39 (4), 827–46

[25] Zhang Yi, Lu Jie, Liu Feng, Liu Qian, Porter Alan, Chen Hongshu & Zhang Guangquan. Does deep learning help topic extraction? A kernel k-means clustering method with word embedding. Journal of Informetrics, 12(4), 1099-1117.
===Appendix 1: Figures and tables in the present study===

Figure 1: Conceptual model of the present study. [Diagram not reproduced. Labels recoverable from the diagram: content variables (emotion coupling, semantic coupling, cognition coupling); social variable (opinion leader); control variables (gender, verification, user posts, followed users); comment-fixed model with outcome "Comment"; repost-fixed model with outcome "Repost".]

Figure 2: Research framework of the present study. [Diagram not reproduced. Labels recoverable from the diagram: Stage 1, figuring out the key factors (A: data collection and data cleaning; B: factors studied in the literature, i.e. UGC and user behavior; C: key factors in the present study, i.e. information coupling). Stage 2, measurement of the key factors (D: information coupling, with emotional coupling measured by sentiment analysis using BERT and a difference score, semantic coupling by topic analysis using LDA + K-means + cosine similarity, and cognitive coupling by metaphor analysis using WordNet + cosine similarity; E: social variable, opinion leader). Stage 3, model regression and validation (F: negative binomial regression with a comment-fixed model and a repost-fixed model; G: moderating effect of the opinion leader; H: model validation). Output: working mechanism of information coupling on the information rediffusion process.]

Figure 3: Emotional fluctuation in time span. [Plot not reproduced. Plot title: sentiment polarity of information source and UGC. Axis label: sentiment polarity. Series: sentiment polarity of information source, sentiment polarity of UGC, difference of sentiment polarity.]

{| class="wikitable"
|+ Table 2: Mean, standard error and correlation variables in comment-fixed effect model
! Variables !! M !! SD !! Model 1 !! Model 2 !! Model 3 !! Model 4 !! Model 5
|-
| Emotional coupling || 1.084 || 0.557 || || 1.007** || 1.210** || 1.014** || 1.001**
|-
| Semantic coupling || 1.033 || 0.034 || || || 0.667*** || 0.698*** || 0.699***
|-
| Cognitive coupling || 1.401 || 0.505 || || || || 0.637*** || 0.658***
|-
| Opinion leader || 3.706 || 0.007 || || || || || 0.227*
|-
| Gender || 0.800 || 0.201 || 0.450** || 0.417** || 0.415** || 0.454** || 0.421**
|-
| Verification || 1.462 || 0.211 || 0.599*** || 0.554*** || 0.534*** || 0.522*** || 0.535***
|-
| User posts || -9.895 || 1.105 || -1.122*** || -1.145*** || -1.146*** || -1.136*** || -1.131***
|-
| Followed users || -2.566 || 0.001 || 0.487*** || 0.424*** || 0.402*** || 0.467*** || 0.435***
|-
| R<sup>2</sup> || || || 0.645 || 0.786 || 0.782 || 0.784 || 0.788
|}
Note: * p < .05. ** p < .01. *** p < .001. Models 1-5 are the comment-fixed models.

{| class="wikitable"
|+ Table 3: Mean, standard error and correlation variables in repost-fixed effect model
! Variables !! M !! SD !! Model 6 !! Model 7 !! Model 8 !! Model 9 !! Model 10
|-
| Emotional coupling || 1.084 || 0.557 || || 0.946** || 0.958** || 0.954** || 0.967**
|-
| Semantic coupling || 1.033 || 0.034 || || || 0.417*** || 0.535*** || 0.447***
|-
| Cognitive coupling || 1.401 || 0.505 || || || || 0.668*** || 0.674***
|-
| Opinion leader || 3.706 || 0.007 || || || || || 3.388**
|-
| Gender || 0.800 || 0.201 || 0.857** || 0.842** || 0.756** || 0.631** || 0.817**
|-
| Verification || 1.462 || 0.211 || 2.345** || 2.398** || 2.452** || 2.354** || 2.315**
|-
| User posts || -9.895 || 1.105 || -0.475*** || -0.425*** || -0.397*** || -0.545*** || -0.465***
|-
| Followed users || -2.566 || 0.001 || 0.035*** || 0.041*** || 0.042*** || 0.038*** || 0.048***
|-
| R<sup>2</sup> || || || 0.771 || 0.782 || 0.781 || 0.786 || 0.788
|}
Note: * p < .05. ** p < .01. *** p < .001. Models 6-10 are the repost-fixed models.

{| class="wikitable"
|+ Table 4: The moderated mediation effect of opinion leader on comment and repost
! Variables !! Model 11 (comment) !! Model 12 (repost)
|-
| Emotional coupling × opinion leader || 2.317*** || 0.389***
|-
| Semantic coupling × opinion leader || 0.532*** || 2.359***
|-
| Cognitive coupling × opinion leader || 2.304*** || 2.707***
|-
| Gender || 0.454** || 0.631**
|-
| Verification || 0.522*** || 2.354**
|-
| User posts || -1.136*** || -0.545***
|-
| Followed users || 0.467*** || 0.038***
|-
| R<sup>2</sup> || 0.527 || 0.642
|}
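
The coefficients reported in Tables 2 to 4 come from negative binomial regressions. As a reading aid, the sketch below assumes these models use the conventional log link; the paper does not state the link function, so this is an assumption rather than the authors' documented setup. Under a log link, a coefficient β acts on the logarithm of the expected count, so exp(β) is the factor by which the expected number of comments or reposts is multiplied for a one-unit increase in a predictor (one standard deviation for the standardized continuous predictors). The helper name `rate_ratio` is hypothetical; the coefficient values are copied from Tables 2 and 3.

```python
import math


def rate_ratio(beta: float) -> float:
    """Return exp(beta): the multiplicative change in the expected count
    for a one-unit increase in a predictor, assuming a log link."""
    return math.exp(beta)


# Coefficients copied from Tables 2 and 3, taken from the first model in
# which each coupling variable enters. The log link is an assumption.
coefficients = {
    "emotional coupling, comments (Model 2)": 1.007,
    "semantic coupling, comments (Model 3)": 0.667,
    "cognitive coupling, comments (Model 4)": 0.637,
    "emotional coupling, reposts (Model 7)": 0.946,
    "semantic coupling, reposts (Model 8)": 0.417,
    "cognitive coupling, reposts (Model 9)": 0.668,
}

for label, beta in coefficients.items():
    print(f"{label}: beta = {beta:.3f}, exp(beta) = {rate_ratio(beta):.2f}")
```

Under that assumption, a coefficient of 1.007 corresponds to the expected number of comments being multiplied by roughly 2.7 for a one-unit increase, which is a multiplicative rather than an additive reading of the effect.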