Tribalism and Fake News: Descriptive and Predictive Models on How Belief Influences News Trust

Uroš Sergaš¹, Habil Kalkan² and Marko Tkalčič¹

¹ University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies, Glagoljaška 8, SI-6000 Koper, Slovenia
² Gebze Technical University, Faculty of Engineering, Cumhuriyet mah, 2254. Sk. No:2, TR-41400 Gebze/Kocaeli, Turkey

Abstract

There are studies that have investigated the perception or the impact of trusting fake news, and there are articles describing how divisions in society arise and what their consequences are. However, few studies have looked at how the divisiveness of society on social networks manifests itself in trust in (fake) news. The problem our research addresses is how fake news and so-called tribalism are connected. We set out to see whether people tend to seek information that validates their current beliefs, even if that information is untrue, rather than seeking the truth. Based on existing research, we created a questionnaire that combined demographic questions, questions about trust, the Big Five personality factors, a quiz in which respondents were asked to spot fake news, and questions designed to determine the tribe of an individual. We also set up a website that mimicked currently popular social networks. Using it, we recorded users' actions, which was an integral part of each individual's participation in this research. The total number of respondents was 138, 69 men and 69 women, mostly from Slovenia and elsewhere in Europe, but also from Asia and North America. The data were cleaned, normalised, factorised and processed. We used various techniques to create new features from the existing data, which helped us in the next step: setting up various models and measuring their prediction accuracy through nested cross-validation.
The experiments we carried out show that, based on an individual's behaviour on a social network, it is possible to determine which tribe he or she belongs to and which news stories they will believe. The results also show that exploring social science questions using machine learning has great potential for future work.

Keywords
tribalism, fake news, media trust, descriptive models

Human-Computer Interaction Slovenia 2022, November 29, 2022, Ljubljana, Slovenia
uros.sergas@upr.si (U. Sergaš); hkalkan@gtu.edu.tr (H. Kalkan); marko.tkalcic@famnit.upr.si (M. Tkalčič)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

1. Introduction

Many of us cannot imagine a day in which we would not willingly scroll through our favourite news source to get updated on global and local happenings. Some would follow this up with a discussion of the topics they read about with friends, coworkers or acquaintances, in person or online. On social media sites, user-created posts are often formatted as short messages written in language that presents the content as definite fact. Ironically, they are often quite far from the truth, and sometimes they are outright lies. A problem emerges when a post gets shared among groups of people who unconditionally believe the information received, just because it fully supports their view on the topic. What would happen if people from such a group read a news article that provides information with which the group strongly disagrees? Would they believe such news, or would they mark it as fake? And how would a person who agrees with the news react? Would they question it at all, or would they accept it as truth since it would seem sensible to them?
All these questions are becoming more and more present in our lives, especially with the growth of media sources, the information they produce, and the so-called "fear of missing out" (FOMO). These factors, together with the world events caused by the Covid-19 pandemic, all contribute to a greater polarisation of society, which inevitably gets divided into smaller ideological bubbles called "tribes". FOMO and polarisation are both a cause and a product of the growth of fake news. To combat the spread of fake news, we need a better understanding of the reasons why someone believes a news article. We looked into [1] to better understand how news (media) trust is defined. We also examined research that focused on the detection of fake news using text-analysis algorithms; these works developed various methods for the early detection of fake news, which in turn reduces its spread [2, 3]. There has also been research on the topic from a sociological point of view [4, 5]. The results of those studies suggest that online trust, self-disclosure, FOMO, and social media fatigue are positively associated with intentionally sharing fake news. Although these topics have been researched before, there is a gap when it comes to modelling the profiles of social media users, their news susceptibility and their ideological alignment. The purpose of this research is to analyse the data collected from a social media website and to try to predict the tribe and news susceptibility of each user. In other words, we are trying to see which properties or aspects tell us what a social media user will believe.

2. Related Work

This section covers the related work that inspired our research. We studied multiple other works, but for the sake of brevity we focus on the ones that influenced us most.
We categorised them into three groups based on their research field: fake news detection; why people share fake news, i.e. the psychology behind the spread of fake news; and polarisation caused by fake news, i.e. the effects that fake news has.

2.1. Fake News Detection

A large share of the research that focuses on detecting fake news is done on social media websites, and sometimes the developed algorithms get incorporated into the websites themselves. For example, Buntain and Golbeck from the University of Maryland developed a model that automatically detects fake news on Twitter [2]. The model included 45 features, which can be categorised into multiple groups, such as structural features (number of followers, whether the account is verified, number and length of posts, and similar) and content features, which focus mostly on the emotion in the text. The algorithm was tested on two data sets, where it reached up to 70.28% accuracy, confirmed with BuzzFeed's fact checkers [2]. The authors of [3] describe the current state of battling the spread of fake news and provide a set of algorithms that can be used for fake news detection. The algorithms were developed on the basis of "social context", meaning that they detect fake news on three levels. The first is the user level, which takes user behaviour into account. The second level focuses on the posts themselves, where the credibility of the source is checked based on the sentiment of the text. The third level consists of detecting fake news within the network: checking whether a user's friends are trustworthy, which groups they follow, and their social hierarchy on the website [3].

2.2. Why People Share Fake News

Among the articles researching the psychological side of susceptibility to fake news, two inspired our research [4, 5].
The first identifies several reasons why people share fake news. The authors discovered that trust in online news is negatively correlated with authenticating news before sharing it: people who are more inclined to trust information online are also the ones who share fake news more often. They also noticed that sharing fake news is closely tied to self-disclosure and FOMO [4]. The second study showed that there was no strong correlation between trusting a news article and sharing it. For example, survey participants were asked to spot fake news among real ones, focusing only on the news titles. When they were asked whether they would still share the news, the credibility of the news article had little to no impact. This means that people might still share a news article that they thought was fake [5].

2.3. Polarisation Caused by Fake News

The research most closely related to ours is [6, 7]. The author of [7] looks into the effects of so-called comfort-zone bubbles in relation to political tribalism on Twitter, focusing on two major tribes: the political left wing and right wing. The author notes that tribe affiliation online gets stronger more often than it gets weaker. He also found that an average social media user can quite easily gather their own audience, which can result in the formation of a new tribe that potentially shares the fake news its members would like to hear [7]. The authors of [6] looked into the psychological factors behind trust in fake news and its resistance to correction. The paper explores the impact of inaccurate information and how it influenced individuals' decision-making during the Covid-19 pandemic.
The authors provide a model of cognitive drivers (intuitive reasoning, cognitive errors, the illusion of truth) and social drivers that influence individuals (source cues, emotions and worldview), and use it to explain how false information can break through the barriers of an individual's beliefs. The researchers conducted tests in which they used facts and evidence to try to break down a person's mental barriers and persuade them to stop believing the false information they had held. This was not always successful, but they reached various conclusions and considerations about how the world might deal with this in advance. They also came to the interesting conclusion that if an individual is initially exposed to weak attempts at disinformation, they develop a mild immunity to the disinformation they will encounter in the future. However, they suggest that more research should be done with a larger data set, that misinformation in forms other than text should be included, and that texts in languages other than English should be analysed [6].

3. Methodology

This section presents all the means we used throughout this research to obtain our results. We also briefly present the sample in the last subsection.

3.1. Questionnaire and Quiz

Completing the questionnaire was the first part of participating in the research. In total, the survey contained forty questions, which can be grouped into five sections or clusters:

• Socio-demographic questions
• Questions about trust in media and science
• Big Five personality traits
• Quiz
• State of vaccination against Covid-19

The quiz in the questionnaire consisted of ten cut-out images of short news stories, which respondents had to judge as true or false. These ten news items can be divided into three groups. The first group contains news promoting the Covid-19 vaccine and describing its benefits. Of these, two are true and two are false.
The second group contains four news items that oppose vaccination or show its negative consequences. Here again, half of the news items are true and half are false. The last group contains so-called neutral news items, which do not lean towards any tribe but simply give some information about vaccination without referring to its advantages or disadvantages. This group contains only two news items, one true and one false. The news items were presented to the respondents in random order, and the respondents were asked to identify the fake and the true ones.

3.2. Social Media Behaviour

To choose which data to record about users' online behaviour, we set up our own mock social media website. It showed the respondents 50 posts, of which 20 were in favour of Covid-19 vaccination, 20 were against it and 10 were neutral. The posts were displayed in random order. Each post contained a short text, which was either a quote, a fact or a short news item. The post contents were all scraped from actual user-written tweets on Twitter.com. The posts also contained a username and a profile picture. In addition, there were buttons at the bottom of each post that allowed the user to react to it: the user could mark a post with "like" or "dislike", and could also write their opinion in the form of a comment below the post. The website also allowed respondents to write their own posts. User actions were tracked and stored in a database, where we recorded the type of action, the time at which the action was taken and the user's identification number, obtained when they entered the website. Performing these actions on the mock social media website ended the respondent's participation in the survey.

3.3. Creating Features

Once the data was collected, we cleaned it using various methods such as normalisation and factorisation. When our data was completely machine-readable, we started creating new features in our data set.
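The cleaning steps just described (normalisation and factorisation) can be sketched as follows. This is a minimal illustration using pandas; the column names and values are hypothetical, not the study's actual schema.

```python
import pandas as pd

# Hypothetical survey records; column names are illustrative only.
df = pd.DataFrame({
    "trust_q1": [1, 3, 5, 2],                                     # 1-5 Likert item
    "employment": ["employed", "student", "employed", "retired"]  # categorical
})

# Min-max normalise the Likert item to [0, 1].
q = df["trust_q1"]
df["trust_q1_norm"] = (q - q.min()) / (q.max() - q.min())

# Factorise the categorical column into integer codes the models can consume.
df["employment_code"], labels = pd.factorize(df["employment"])

print(df["trust_q1_norm"].tolist())    # [0.0, 0.5, 1.0, 0.25]
print(df["employment_code"].tolist())  # [0, 1, 0, 2]
```

After this step every column is numeric, which is what the subsequent feature construction and modelling require.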
We grouped all the questions about trust into two variables: trust in media and trust in Covid-19 media coverage. We aggregated the Big Five questions into five variables. From the quiz we extracted variables indicating which news items a respondent trusted and which tribe they aligned with more closely. The mock social media website gave us the most features, consisting of like and dislike ratios broken down by the tribe each post was aligned with. We also derived variables from comments and user posts, on which we ran natural language processing to extract the sentiment of the text.

3.4. Sample Presentation

In this section, we describe the sample formed by our respondents. The sample consisted of 138 individuals, 69 women and 69 men. The survey was shared internationally, so the questionnaire and the fictitious social network were created in English. Respondents came from 27 different countries, with the majority, 73%, from Slovenia, 22% from the rest of Europe and 5% from Asia or North America. Respondents were divided into four age ranges: 41% were between 18 and 25 years old, 54% between 26 and 35, 5% between 36 and 60, and 1% were 61 or older. The questionnaire also asked respondents about their current education level and employment status. Education was divided into six levels, ranging from primary school to PhD. The majority, 41% of respondents, had reached a bachelor's degree, 24% a master's degree, 24% had finished high school, 5% middle school, 4% held PhDs and 1% had an elementary-school education. The employment status of respondents was divided into four classes, namely "employed" (57%), "unemployed" (9%), "student" (32%) and "retired" (2%).

4. Results

We first performed a correlational analysis among the groups of variables collected from the questionnaire. Figure 1 shows the results.
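As a sketch, this kind of correlational analysis amounts to a pairwise Pearson correlation over the questionnaire variables. The values below are toy stand-ins, not the study's data; only the variable names "mediaAvgTrust" and "c19AvgTrust" come from the paper.

```python
import pandas as pd

# Toy stand-ins for three questionnaire variables (not the study's data).
df = pd.DataFrame({
    "mediaAvgTrust": [2.0, 3.0, 4.0, 2.5, 3.5],
    "c19AvgTrust":   [1.5, 3.2, 4.5, 2.0, 3.8],
    "openness":      [3.0, 2.0, 4.0, 5.0, 1.0],
})

# Pairwise Pearson correlation matrix, the kind of table visualised in Figure 1.
corr = df.corr()
print(corr.round(3))
```

By construction, the two trust variables correlate strongly with each other while openness does not, mirroring the pattern described below for the actual data.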
From the correlation matrix we can see that most of the values have a very low degree of dependence on each other, which should not be surprising, as the questionnaire was designed to measure as many different independent values as possible. As we can see from the graph, the variables measuring trust in news, trust in news about Covid-19 and trust in individual news sources were the most correlated. It can also be seen that some quiz questions correlate negatively with each other and with vaccination status. We now take a closer look at a few of these correlations.

Figure 1: Correlation values between variables from the questionnaire

4.1. Is There a Correlation Between an Individual's Prior Beliefs and Their Trust in the News?

We decided to test the dependence of the variables "mediaAvgTrust" (average of responses to questions on general trust in the media) and "c19AvgTrust" (average of responses to questions on trust in Covid-19 news) on two sets: vaccinated and non-vaccinated. First, we used a two-sample Kolmogorov-Smirnov test to check whether the vaccinated and non-vaccinated sets have the same distribution of data, and found that they do not. For "mediaAvgTrust", we used a two-sample t-test. The vaccinated set returned a mean of 2.983 and the non-vaccinated set a mean of 2.420. The p-value of the test was 8.265 × 10^-4, which tells us that the two sets are not equal. For "c19AvgTrust", we repeated the procedure with a t-test. Here the mean of the vaccinated set was 3.579 and that of the non-vaccinated set 2.31. The p-value of the test was 3.551 × 10^-10, which again tells us that the two sets are not identical. It can thus be argued that there is a link between an individual's prior beliefs and their trust in media.
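The two-step procedure above (a Kolmogorov-Smirnov check on the distributions, followed by a two-sample t-test on the means) can be sketched with scipy. The data below are synthetic, with group means only loosely mirroring the reported 2.983 vs. 2.420; the use of Welch's unequal-variance t-test and the group sizes are our assumptions, not the study's.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic "mediaAvgTrust" scores for the two groups (not the study's data).
vaccinated     = rng.normal(loc=2.98, scale=0.6, size=100)
non_vaccinated = rng.normal(loc=2.42, scale=0.6, size=38)

# Step 1: two-sample Kolmogorov-Smirnov test for equality of distributions.
ks_stat, ks_p = stats.ks_2samp(vaccinated, non_vaccinated)

# Step 2: two-sample t-test on the means (Welch's variant, no equal-variance assumption).
t_stat, t_p = stats.ttest_ind(vaccinated, non_vaccinated, equal_var=False)

print(f"KS p-value: {ks_p:.4g}")
print(f"t-test p-value: {t_p:.4g}")
```

With a mean difference of this size, both tests reject the hypothesis that the two groups behave the same, as they did on the real data.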
On average, individuals belonging to the non-vaccinated tribe trusted everyday news less, and Covid-19 news even less, while people who were vaccinated against Covid-19 were generally more trusting.

4.2. Will Individuals Belonging to the Same Tribe on an Imaginary Social Network Mostly Take Positive Actions on Posts That Support Their Opinions and Negative Actions on Those That Do Not?

We checked the correlations between the numbers of likes/dislikes on the posts shown to the respondents on the mock social network.

Table 1: Average numbers of likes and dislikes per type of post (pro-vaccination vs. anti-vaccination posts)

Tribe          Likes on Provax   Dislikes on Provax   Likes on Antivax   Dislikes on Antivax
Pro-vaccine    5.183             1.946                1.849              5.183
Anti-vaccine   1.567             4.567                4.4                2.4

We notice that the average number of positive votes on posts that support a tribe's opinion is always higher than on posts that do not. After calculating p-values for all cases, we can say that there is a difference between the actions taken according to tribe, with one exception: the dislikes given by the non-vaccinated tribe. There, members of the tribe turned out to be more likely to vote negatively on posts in general; the high correlation between the two sets and the high p-value mean that in this case we would have to accept the null hypothesis. In addition to the votes on the posts, we also wanted to see whether there was a correlation between the comments made by the members of one tribe and the other. Since we were testing two samples for each tribe, we used a two-sample t-test.
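A per-emotion comparison of the comments can be sketched as a loop of two-sample t-tests, one per emotion dimension, producing p-values of the kind summarised in Table 2. The emotion scores below are random placeholders, not the study's sentiment outputs, and the group sizes are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Placeholder per-comment emotion scores for two tribes (one column per emotion).
emotions = ["anger", "joy", "optimism", "sadness"]
pro_vaccine  = rng.random((40, len(emotions)))
anti_vaccine = rng.random((25, len(emotions)))

# One two-sample t-test per emotion dimension.
p_values = {}
for i, emotion in enumerate(emotions):
    t_stat, p = stats.ttest_ind(pro_vaccine[:, i], anti_vaccine[:, i], equal_var=False)
    p_values[emotion] = p
    print(f"{emotion}: p = {p:.3f}")
```

Each p-value answers the same question as one cell of Table 2: whether the two tribes' comments differ on that emotion dimension.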
Table 2: p-values according to the measured emotion and post type

Post type                       Anger   Joy     Optimism   Sadness   Polarisation   Subjectivity
Pro-vaccine                     0.081   0.023   0.934      0.509     0.135          0.376
Anti-vaccine                    0.691   0.161   0.392      0.937     0.896          0.51
Neutral regarding vaccination   0.989   0.083   0.471      0.329     0.917          0.998

As for the comments, Table 2 shows that in almost all cases the p-values are too high to suggest that the comments differ between the vaccinated and non-vaccinated tribes. If we looked at the comments alone, we would accept the null hypothesis, which would imply that individuals, regardless of tribe, make the same types of comments. Taking both types of social network actions together, and giving more weight to the features obtained from voting than to those obtained from comments, we can say that in most cases individuals belonging to the same tribe indeed took positive actions on posts that supported their opinions and negative actions on those that did not.

4.3. Do Personality Traits Influence Beliefs and Tribal Affiliation?

For this research question, we first performed a t-test on two samples, vaccinated and non-vaccinated, for each of the personality factors we measured.

Table 3: p-values by personality factor

Personality factor   Extraversion   Neuroticism   Openness   Agreeableness   Conscientiousness
p-value              0.7353         0.4945        0.01341    0.2703          0.00851

Notice that for two personality factors, openness and conscientiousness, the tribes differ sufficiently to reject the null hypothesis and accept the alternative that these personality traits do have an effect. The same cannot be said for the other personality factors.

5. Discussion

The results, obtained by means of statistical tests and correlation calculations, show that there is a correlation between individuals' prior beliefs and their trust in media.
We also showed that individuals who belong to the same tribe tend to perform positive actions on posts that support their opinions and negative actions on those that do not on the mock social media website. However, when we set out to test whether personality traits influence beliefs and tribe alignment, we found that only two personality factors, openness and conscientiousness, can be said to be sufficiently differentiated between tribes, while the other three, extraversion, agreeableness and neuroticism, cannot. We also found that an individual's prior beliefs quite strongly dictate which news items they trust and which they do not, depending on whether the items support their opinion. This is reflected not only in the reading of journalistic and scientific articles, but also in online behaviour: individuals are more likely to react positively to publications that support their views and negatively to those that oppose them, which should not come as a huge surprise.

6. Limitations and Future Work

The study has some limitations, which in some cases had a significant impact on the results we obtained. The first and perhaps biggest drawback is the small pool of individuals participating in the survey, which means that the results often varied considerably. It would also have been better to have more non-vaccinated respondents, as they made up less than a quarter of the sample, which affected the accuracy of the models and the high threshold of the reference values. The results would likely have been different had the survey been prepared and distributed a few months earlier, when fewer people were vaccinated and/or the uncertainty about vaccines was higher. More time should also be spent on the design of the mock social network. It should have encouraged respondents to participate more, as many appeared to be "shy".
Ideally, everyone would have written at least three comments or posts and voted on at least 15 posts; then we would have had more meaningful data to work with. We should also have designed the natural language processing features better, which we could not do due to lack of time. These turned out to be quite irrelevant in our case, which again affected the overall results. The conduct of the research and the results obtained give us an even greater desire to carry out the research on a larger scale with the aforementioned limitations resolved. We hope that this research inspires or even helps researchers who are going to explore similar social issues through data science. It can serve as an ideal tool for integrating natural sciences, computer science and statistics with social sciences such as psychology and sociology.

References

[1] J. Strömbäck, Y. Tsfati, H. Boomgaarden, A. Damstra, E. Lindgren, R. Vliegenthart, T. Lindholm, News media trust and its impact on media use: toward a framework for future research, Annals of the International Communication Association 44 (2020) 139–156. doi:10.1080/23808985.2020.1755338.
[2] C. Buntain, J. Golbeck, Automatically identifying fake news in popular Twitter threads, in: 2017 IEEE International Conference on Smart Cloud (SmartCloud), 2017, pp. 208–215. doi:10.1109/SmartCloud.2017.40.
[3] K. Shu, H. Liu, J. Han, Detecting Fake News on Social Media, Morgan & Claypool Publishers, 2019.
[4] S. Talwar, A. Dhir, P. Kaur, N. Zafar, M. Alrasheedy, Why do people share fake news? Associations between the dark side of social media use and fake news sharing behavior, Journal of Retailing and Consumer Services 51 (2019) 72–82. doi:10.1016/j.jretconser.2019.05.026.
[5] G. Pennycook, The psychology of fake news, Trends in Cognitive Sciences 25 (2021). doi:10.1016/j.tics.2021.02.007.
[6] U. Ecker, S. Lewandowsky, J. Cook, P. Schmid, L. Fazio, N. Brashier, P. Kendeou, E. Vraga, M. Amazeen, The psychological drivers of misinformation belief and its resistance to correction, Nature Reviews Psychology 1 (2022) 13–29. doi:10.1038/s44159-021-00006-y.
[7] M. Seemann, Digital tribalism: the real story about fake news, https://www.ctrl-verlust.net/digital-tribalism-the-real-story-about-fake-news/, 2017.