<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Do We Trust What They Say, or What They Do? A Multimodal User Embedding Provides Personalized Explanations</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Zhicheng</forename><surname>Ren</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Aurora Innovation</orgName>
								<address>
									<addrLine>280 N Bernardo Ave</addrLine>
									<postCode>94043</postCode>
									<settlement>Mountain View</settlement>
									<region>CA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhiping</forename><surname>Xiao</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">University of Washington</orgName>
								<address>
									<addrLine>1410 NE Campus Pkwy</addrLine>
									<postCode>98195</postCode>
									<settlement>Seattle</settlement>
									<region>WA</region>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Yizhou</forename><surname>Sun</surname></persName>
							<email>yzsun@cs.ucla.edu</email>
							<affiliation key="aff2">
								<orgName type="institution">University of California</orgName>
								<address>
									<postCode>90095</postCode>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Do We Trust What They Say, or What They Do? A Multimodal User Embedding Provides Personalized Explanations</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">3E9EB8305BFFE55F631C3EBABA84810B</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:00+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Multi-modal representation learning</term>
					<term>Social network analysis</term>
					<term>User embeddings</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>With the rapid development of social media, the importance of analyzing social network user data has also been put on the agenda. User representation learning in social media is a critical area of research, based on which we can conduct personalized content delivery or detect malicious actors. Being more complicated than many other types of data, social network user data has an inherent multimodal nature. Various multimodal approaches have been proposed to harness both the text (i.e., post content) and the relation (i.e., inter-user interaction) information to learn user embeddings of higher quality. The advent of Graph Neural Network models enables more end-to-end integration of user text embeddings and user interaction graphs in social networks. However, most of those approaches do not adequately elucidate which aspects of the data (text or graph-structure information) are more helpful for predicting each specific user under a particular task, putting some burden on personalized downstream analysis and untrustworthy information filtering. We propose a simple yet effective framework called Contribution-Aware Multimodal User Embedding (CAMUE) for social networks. We demonstrate with empirical evidence that our approach can provide personalized explainable predictions, automatically mitigating the impact of unreliable information. We also conducted case studies to show how reasonable our results are. We observe that for most users, graph-structure information is more trustworthy than text information, but there are some reasonable cases where text helps more. Our work paves the way for more explainable, reliable, and effective social media user embedding, which allows for better personalized content delivery.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The advancement of social networks has placed the analysis and study of social network data at the forefront of priorities. User-representation learning is a powerful tool for solving many critical problems in social media studies. Reasonable user representations in vector space can help build recommendation systems <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, conduct social analysis <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>, detect bot accounts <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>, and so on. To obtain user embeddings of higher quality, many multimodal methods have been proposed to fully utilize all types of available information from social networks, including interaction graphs, user profiles, images, and texts from their posts <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12]</ref>. Compared with models using single-modality data, multimodal methods utilize more information from social media platforms; hence they usually achieve better results in downstream tasks.</p><p>Among all modalities in social networks, user interaction graphs (i.e., what they do) and text content (i.e., what they say) are the two most frequently used options, due to their good availability across different datasets and the large number of observations. 
Graph-neural-network (GNN) models <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref> make it more convenient to fuse both the text information and the graph-structure information of social-network users, where text embeddings from language models such as GloVe <ref type="bibr" target="#b15">[16]</ref> or BERT <ref type="bibr" target="#b16">[17]</ref> are usually directly incorporated into GNNs as node attributes. Although those approaches have achieved great performance in a range of downstream tasks <ref type="bibr" target="#b17">[18]</ref>, the text information and graph-structure information are fully entangled with each other, which makes it hard to illustrate the two modalities' respective contributions to learning each user's representation.</p><p>Researchers have already found that different groups of users can behave very differently on social media <ref type="bibr" target="#b18">[19]</ref>. If such differences are not correctly captured, they might cause significant bias in user attribute prediction (e.g., political stance prediction) <ref type="bibr" target="#b19">[20]</ref>. Hence, when learning multimodal user representations for different users, it is important to ask not only what the prediction results are, but also why we are making such predictions (e.g., are those predictions due to the same reason?). Only then can we provide more insight into user modeling, and potentially enable unbiased and personalized downstream analysis for different user groups.</p><p>On the other hand, under a multi-modality setting, if one aspect of a user's data is untrustworthy and misleading, it might still be fused into the model and make the performance lower than single-modality models <ref type="bibr" target="#b20">[21]</ref>. Consider the case when we want to make a political ideology prediction for Elon Musk based on his Twitter content before the 2020 U.S. 
presidential election (Figure <ref type="figure" target="#fig_1">1</ref>), when he had not yet revealed his clear Republican political stance. If we trust the follower-followee graph-structure information, we can see that he is likely to be a Republican, since he follows more Republicans than Democrats and has more frequent interactions with verified Republican accounts. However, his word choice in his tweet content also shows some Democratic traits. Due to the existence of such conflicting information, being able to automatically identify which modality is more trustworthy for each individual becomes essential in building an accurate social media user embedding for different groups of users.</p><p>To address the above two shortcomings of text-graph fusion in social networks, we propose a simple yet effective framework called Contribution-Aware Multimodal User Embedding (CAMUE), which can identify and remove a misleading modality from specific social network users during text-graph fusion, in an explainable way. CAMUE uses a learnable attention module to decide whether we should trust the text information or the graph-structure information when predicting individual user attributes, such as political stance. The framework then outputs a clear contribution map for each modality on each user, allowing personalized explanations for downstream analysis and recommendations. For ambiguous users whose text and graph-structure information disagree, our framework can successfully mitigate unreliable information among different modalities by automatically adjusting the weight of that information accordingly.</p><p>We conduct experiments on the TIMME dataset <ref type="bibr" target="#b20">[21]</ref> used for a Twitter political ideology prediction task. We observed that our contribution map can give us some interesting new insights. 
A quantitative analysis of different Twitter user sub-groups shows that link information (i.e., the interaction graph) contributes more than text information for most users. This suggests that when creating personalized advertisement content, political advertising agencies should gather more interaction-graph information about Twitter users, instead of relying too much on their text data. We also observe that when the graph and text backbones are set to R-GCN and GloVe respectively, our approach ignores the unreliable GloVe embedding and achieves better prediction results. When the text modality is switched to a more accurate BERT embedding, our framework can assign graph/text weights to different users accordingly and achieve performance comparable to existing R-GCN-based fusion methods. We pick 9 celebrities among the 50 most-followed Twitter accounts<ref type="foot" target="#foot_0">1</ref>, such as Elon Musk. A detailed qualitative analysis of their specific Twitter behaviors shows that our contribution map models their online behaviors well. Finally, we run experiments on the TwiBot-20-Sub dataset <ref type="bibr" target="#b21">[22]</ref> used for a Twitter human/bot classification task, showing that our framework can be generalized to other user attribute prediction tasks. By creating social media user embeddings that are more explainable, reliable, and effective, our framework enables improved customized content delivery.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Preliminaries and Related Work</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Multimodal Social Network User Embedding</head><p>Social network user embedding is a popular research field that aims to build accurate user representations. A desirable user embedding model should accurately map sparse user-related features in high-dimensional spaces to dense representations in low-dimensional spaces. Multimodal social network user embedding models utilize different types of user data to boost their performance. Commonly seen modality combinations include graph-structure (i.e., link) data and text data <ref type="bibr" target="#b22">[23]</ref>, graph-structure data and tabular data <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b8">9]</ref>, and graph-structure data, text data, and image data altogether <ref type="bibr" target="#b25">[26]</ref> [27], etc.</p><p>Among those multi-modality methods, the fusion of graph-structure data and text data has always been one of the mainstream approaches for user embedding. At an earlier stage, without much help from GNN models, most works trained the network embedding and text embedding separately and fused them using a joint loss <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b28">29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b30">31]</ref>. With the help of GNN models, a new type of fusion method gained popularity, where the users' text embeddings are directly incorporated into GNNs as node attributes <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b31">32,</ref><ref type="bibr" target="#b32">33]</ref>.</p><p>Despite their good performance, existing models do not explain how much the graph structure and the text information of specific users contribute to the final prediction results, making it difficult to give customized modality weights for downstream analysis or recommendations. 
Also, if one modality is poorly learned, it can be counterproductive for the user embedding quality, making the result even worse than its single-modality counterparts <ref type="bibr" target="#b20">[21]</ref>. How to address this problem in a universally learned way, instead of through heuristic-based information filtering, has largely gone under-explored. Hence, we propose a framework that not only utilizes both text and graph-structure information, but also reveals their relative importance along with the prediction result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Graph Neural Network</head><p>Graph Neural Networks (GNNs) are a collection of deep learning models that learn node embeddings through iterative aggregation of information from neighboring nodes, using a convolutional operator. Most GNN architectures include a graph convolution layer in a form that can be characterized as message passing and aggregation. A general formula for such convolution layers is:</p><formula xml:id="formula_0">H^(l) = σ(Ã H^(l−1) W^(l)),<label>(1)</label></formula><p>where H^(l) represents the hidden node representations of all nodes at layer l, the operator σ is a non-linear activation function, the graph-convolutional filter Ã is a matrix that usually takes the form of a transformed (e.g., normalized) adjacency matrix A, and the layer-l weight W^(l) is learnable.</p><p>In the past few years, GNN models have reached SOTA performance in various graph-related tasks. They are widely regarded as promising techniques to generate node embeddings for users in social-network graphs <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b33">34,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref>.</p></div>
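As a concrete illustration of the message-passing form in Equation (1), a single graph-convolution layer can be sketched in a few lines of NumPy. The symmetric degree normalization and the ReLU non-linearity are assumptions for this sketch; Equation (1) only requires some transformed adjacency matrix Ã and some activation σ.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H_out = sigma(A_tilde @ H @ W).

    A: (n, n) adjacency matrix; H: (n, d_in) node features;
    W: (d_in, d_out) weights. Uses symmetric degree normalization
    A_tilde = D^{-1/2} (A + I) D^{-1/2} (a GCN-style choice; Eq. (1)
    allows any transformed adjacency) and ReLU as the activation.
    """
    n = A.shape[0]
    A_hat = A + np.eye(n)                       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_tilde = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_tilde @ H @ W, 0.0)     # ReLU

# toy 3-node path graph with one-hot node features
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.eye(3)
W = np.ones((3, 2))
H1 = gcn_layer(A, H, W)
print(H1.shape)  # (3, 2)
```

On this toy graph, each node's output mixes its own features with its neighbors' features before the linear projection, which is exactly the aggregation step the text describes.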
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Neural Network-based Language Models</head><p>The field of natural language processing has undergone a significant transformation with the advent of neural-network-based language models. Word2Vec <ref type="bibr" target="#b34">[35]</ref> introduced two architectures: Continuous Bag-of-Words (CBOW) and Skip-Gram. CBOW predicts a target word given its context, while Skip-Gram predicts context words given a target word. The GloVe model <ref type="bibr" target="#b15">[16]</ref> went further by incorporating global corpus statistics into the learning process. ELMo <ref type="bibr" target="#b35">[36]</ref> was another significant step forward, as it introduced context-dependent word representations, making it possible for the same word to have different embeddings in different contexts. BERT <ref type="bibr" target="#b16">[17]</ref> is a highly influential model built on the transformer architecture <ref type="bibr" target="#b36">[37]</ref>, pre-trained on large text corpora using, for example, masked language modeling and next-sentence prediction tasks. Recently, large language models like GPT-3 <ref type="bibr" target="#b37">[38]</ref>, InstructGPT <ref type="bibr" target="#b38">[39]</ref>, and ChatGPT have achieved significant breakthroughs in natural-language-generation tasks. All those models are frequently used to generate text embeddings for social network users.</p><p>Our framework does not rely on any specific language model, and we do not have to use LLMs. Instead, we use language models as a replaceable component, making it possible for both simpler ones like GloVe and more complicated ones like BERT to fit in. We explore several different options in the experimental section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Multimodal Explanation Methods</head><p>In the past, several methods have been proposed to improve the interpretability and explainability of multimodal fusion <ref type="bibr" target="#b39">[40]</ref>. Commonly used strategies include attention-based methods <ref type="bibr" target="#b40">[41,</ref><ref type="bibr" target="#b41">42,</ref><ref type="bibr" target="#b42">43]</ref>, counterfactual-based methods <ref type="bibr" target="#b43">[44,</ref><ref type="bibr" target="#b44">45]</ref>, scene-graph-based methods <ref type="bibr" target="#b45">[46]</ref>, and knowledge-graph-based methods <ref type="bibr" target="#b46">[47]</ref>. Unfortunately, most of them focus on fusing the image and text modalities, primarily for the VQA task, while to the best of our knowledge, no work focuses on improving the explainability of fusing network-structure data and text data in social-network user embedding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Problem Definition</head><p>Our general goal is to propose a social network user embedding fusion framework that can answer: 1. which modality (i.e., text or graph structure; saying or doing) contributes more to our user attribute prediction, hence allowing more customized downstream user behavior analysis, and 2. which modality should be given more trust for each user, automatically filtering out the untrustworthy information when necessary, in order to achieve higher-quality multimodal user embeddings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Problem Formulation</head><p>A general framework for our problem can be formulated as follows: we are given a social media interaction graph G = (V, E), with node set V representing users and edge set E representing links between users.</p><formula xml:id="formula_1">Let X = [x_1, x_2, x_3, ⋯, x_n] be the text content of n = |V| users, Y = [y_1, y_2, y_3, ⋯, y_n] be the labels of those users, and A = [A_1, A_2, ⋯, A_m]</formula><p>be the adjacency matrices of G, where m is the number of link types and A_i ∈ ℝ^(n×n). Our training objective is:</p><formula xml:id="formula_2">min E[L(f(G, X), Y)]<label>(2)</label></formula><p>Here, L is the loss of our specific downstream task, and f is some function that combines the graph-structure information and text information, producing a joint user embedding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Preliminary Experiment</head><p>To investigate the effectiveness of existing GNN-based multimodal fusion methods in filtering out the unreliable modality when the graph structure and text contradict each other, we run experiments using a common fusion method that feeds the fine-tuned BERT features into the R-GCN backbone, similar to the approaches in <ref type="bibr" target="#b22">[23]</ref> and <ref type="bibr" target="#b9">[10]</ref>. We observe that this conventional fusion method fails to filter the unreliable information for some of those ambiguous users. Table <ref type="table" target="#tab_0">1</ref> shows two politicians whose Twitter data contains misleading information, either in the graph structure or in the text data. While the single-modality backbones, which are trained without the misleading information, give the correct predictions, the multi-modality fusion method is confused by the misleading information and hence is not able to make correct predictions.</p><p>These insights reveal the importance of having a more flexible and explainable framework for learning multimodal user embeddings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodology</head><p>We propose Contribution-Aware Multimodal User Embedding (CAMUE), a fusion framework for text data and graph-structure data when learning user embeddings in social networks. The key ingredient of this framework is an attention-gate-based selection module, which is learned together with the link and text data and decides which information we want to trust more for each particular user.</p><p>Our framework has three main parts: a text encoder, a graph encoder, and an attention-gate learner. The text content of each user passes through the text encoder and generates a text embedding for that user. The embedding is then passed through a three-layer MLP for fine-tuning. The adjacency matrix of the users passes through the graph encoder and generates a node embedding for each user. At the same time, both the text embedding and the graph adjacency matrix pass through our attention-gate learner. The output of this module is two attention weights, α and β, which control the proportions of our graph-structure information and text information. Without loss of generality, if we make R-GCN our graph encoder and BERT our text encoder, our model is trained in the following way (Equations 3-6, also illustrated in Figure <ref type="figure" target="#fig_2">2</ref>):</p><formula xml:id="formula_3">H^(1) = σ(concat(A_1 + A_2 + ⋯ + A_m, BERTemb(X)) W^(1))<label>(3)</label></formula><p>H^(2) = σ(H^(1) W^(2)) (4) </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>[e_α, e_β] = H^(2) W^(3) (5)</p><formula xml:id="formula_4">α = softmax(e_α), β = softmax(e_β)<label>(6)</label></formula><p>where H and W are the hidden layers and weights of our attention-gate learner,</p><formula xml:id="formula_5">X = [x_1, x_2, x_3, ⋯, x_n] is the text content, BERTemb is the BERT encoding module, A = [A_1, A_2, ⋯, A_m]</formula><p>is the adjacency matrices of G, and m is the number of link types.</p><p>Then, our overall training objective becomes:</p><formula xml:id="formula_6">min E[L((α + λ) R-GCNemb(G) + (β + λ) BERTemb(X), Y)]</formula><p>Here, λ acts as a regularizer to ensure our model is not overly dependent on a single modality.</p><p>Our method offers two levels of separation. First, we separate the text encoder and the graph encoder to allow better disentanglement of which data contributes more to our final prediction results. Second, we separate the learning of the downstream tasks from the learning of which data modality (i.e., text or graph structure) we can rely on more. This makes our framework adaptable to different downstream social media user prediction tasks. The learned trustworthiness of different modalities allows for auto-adjustment of the weights between graph-structure and text modalities, hence filtering out any unreliable information once it is discovered.</p><p>Figure <ref type="figure" target="#fig_2">2</ref> shows the overall architecture of our framework; note that the graph-structure encoder and text encoder can be replaced by any other models that serve the same purposes. 
We give a short complexity analysis of our architecture for the case of R-GCN + BERT. Since we use a sparse adjacency matrix for R-GCN, the graph encoder has a complexity of O(L_graph E F_graph + L_graph N F_graph²) (according to <ref type="bibr" target="#b47">[48]</ref>), where L is the number of layers, E is the number of edges, N is the number of nodes, and F is the feature dimension. Since we fixed the maximum text length to a constant for the text encoder, it has a complexity of O(F_text²) (based on <ref type="bibr" target="#b48">[49]</ref>). Since F_text and F_graph are of comparable size, our fusion module has a complexity of O(F²). </p></div>
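The attention gating in Equations (5)-(6) and the gated fusion objective can be sketched in NumPy as follows. This is one plausible reading, under the assumptions that the softmax is taken over the two modality logits of each user and that the regularizing constant (written lam here, for λ) is a small fixed value; the function names are illustrative, not the authors' code.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_gate(H2, W3, lam=0.1):
    """Per-user modality weights (one reading of Eqs. 5-6).

    H2: (n, d) hidden states of the gate learner; W3: (d, 2) maps
    each user to the two logits [e_alpha, e_beta]. lam is the
    regularizing constant added to both weights (assumed value)."""
    logits = H2 @ W3                     # (n, 2): [e_alpha, e_beta]
    weights = softmax(logits, axis=1)    # alpha + beta = 1 per user
    return weights[:, 0] + lam, weights[:, 1] + lam

def fuse(graph_emb, text_emb, alpha, beta):
    """Gated fusion: (alpha + lam)*graph + (beta + lam)*text, row-wise."""
    return alpha[:, None] * graph_emb + beta[:, None] * text_emb

rng = np.random.default_rng(0)
H2 = rng.normal(size=(4, 8))             # 4 users, 8-dim gate states
W3 = rng.normal(size=(8, 2))
alpha, beta = attention_gate(H2, W3)
fused = fuse(rng.normal(size=(4, 16)), rng.normal(size=(4, 16)), alpha, beta)
print(fused.shape)  # (4, 16)
```

Per user, alpha + beta sums to 1 + 2·lam, so the λ term keeps a floor on each modality's weight, matching the stated role of preventing over-dependence on a single modality.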
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Tasks and Datasets</head><p>We run experiments on two Twitter user prediction tasks: 1. Predicting the political ideology of Twitter users (Democrat vs Republican) and 2. Predicting whether a Twitter user account is a human or a bot.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.1.">TIMME</head><p>TIMME <ref type="bibr" target="#b20">[21]</ref> introduced a multi-modality Twitter user dataset as a benchmark for the political ideology prediction task on Twitter users. TIMME contains 21,015 Twitter users and 6,496,112 Twitter interaction links. Those links include follows, retweets, replies, mentions, and likes. Together they form a large heterogeneous social network graph. TIMME also contains 6,996,310 raw tweets from those users. Hence, it is a good dataset for studying different fusion methods of text features and graph-structure features. In TIMME, there are 586 labeled politicians and 2,976 randomly sampled users with a known political affiliation. Some of them are the ambiguous users we investigated before. Labeled nodes belong to either Democrats or Republicans. Note that the dataset cut-off time is 2020, so the political polarities of many public figures (e.g., Elon Musk) had not been revealed at that time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.2.">TwiBot-20-Sub</head><p>TwiBot-20 <ref type="bibr" target="#b21">[22]</ref> is an extensive benchmark for Twitter bot detection, comprising 229,573 Twitter accounts, of which 11,826 are labeled as human users or bots. The dataset also contains 33,716,171 Twitter interaction links and 33,488,192 raw tweets. The links in TwiBot-20 include follows, retweets, and mentions. To further examine the generalizability of our method, we run experiments for Twitter bot account detection on the TwiBot-20 dataset. To reduce the computation cost of generating node features and text features, we randomly subsample 3,000 labeled users and 27,000 unlabeled users from the TwiBot-20 dataset, forming a new dataset called TwiBot-20-Sub. In this way, the size and label sparsity of the TwiBot-20-Sub dataset become comparable with the TIMME dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.3.">Train-test Split</head><p>We split the users of both datasets into an 80%:10%:10% ratio for the training set, validation set, and test set respectively.</p></div>
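The 80%:10%:10% split described above amounts to a shuffled index partition; a minimal sketch follows (the seed and shuffling procedure are assumptions, since the paper does not specify them):

```python
import numpy as np

def split_users(n_users, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Shuffle user indices and cut them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_users)
    n_train = int(ratios[0] * n_users)
    n_val = int(ratios[1] * n_users)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

# e.g., the 3,000 labeled TwiBot-20-Sub users
train, val, test = split_users(3000)
print(len(train), len(val), len(test))  # 2400 300 300
```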
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Implementation Detail</head><p>To test the effectiveness of our framework across different models, we choose two single-modality text encoders, GloVe and BERT, and two single-modality graph encoders, MLP and R-GCN.</p><p>The GloVe embedding refers to the Wikipedia 2014 + Gigaword 5 (300d) pre-trained version.<ref type="foot" target="#foot_1">2</ref> The BERT embedding refers to the sentence-level ([CLS] token) embedding of the BERT-base model <ref type="bibr" target="#b50">[50]</ref> after fine-tuning the pre-trained model's parameters on the tweets from our training set, which consists of 80% of the users. We chose a max sequence length of 32. After the encoding, we have a 300-dimension text embedding for GloVe and a 768-dimension text embedding for BERT.</p><p>We choose a modified version of R-GCN from TIMME <ref type="bibr" target="#b20">[21]</ref> as the R-GCN graph encoder. R-GCN <ref type="bibr" target="#b33">[34]</ref> is a GNN model specifically designed for heterogeneous graphs with multiple relations. The TIMME paper discovered that assigning different attention weights to the relation heads of the R-GCN model could improve its performance. Hence, we adopt their idea and use the modified version of R-GCN. We do not use the complete TIMME model, since it is designed for multiple tasks outside our research scope and would overly complicate our model.</p><p>We also choose a 3-layer MLP as another graph encoder for comparison; the adjacency list for each user is passed to the MLP.</p><p>Large language models (LLMs) like ChatGPT are powerful in understanding texts, but they usually have a great number of parameters, making traditional supervised fine-tuning a hard and costly task <ref type="bibr" target="#b37">[38]</ref>. 
Instead, less resource-intensive methods like few-shot learning, prompt tuning, instruction tuning, and chain-of-thought are more frequently used to adapt LLMs to specific tasks <ref type="bibr" target="#b51">[51]</ref>. We do not use large language models as one of the options for the text encoder, since those methods are not compatible with our framework: they do not provide a well-defined gradient to train our attention-gate learner.</p><p>We run experiments on a single NVIDIA Tesla A100 GPU. We use the same set of hyper-parameters as in the TIMME paper, with a learning rate of 0.01, 100 GCN hidden units, and a dropout rate of 0.1, on a PyTorch platform. For a fair comparison, we run each algorithm on each task over 10 random seeds.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Results and Analysis</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Contribution Map</head><p>To show that our framework can effectively provide personalized explanations during the fusion of modalities, we draw the contribution map based on the α (graph weight) and β (text weight) attention for users from each dataset. The darker the color, the closer the corresponding modality's weight is to 1.</p><p>In the contribution map, pure white indicates a zero contribution (0) from a modality, while pure dark blue indicates a full contribution (1).</p><p>The top figure of Figure <ref type="figure" target="#fig_3">3</ref> shows the contribution map output when the text encoder is BERT and the graph encoder is R-GCN, on a subgroup of the TIMME dataset consisting of some politicians and some random Twitter users. As we can see, there is a clear cut between the percentages of contributions from different modalities to the final prediction. Notably, for the two ambiguous politician users we mentioned earlier (Ryan Costello and Sheldon Whitehouse), CAMUE gives the correct attention: we should trust the text data more for Mr. Costello, while trusting the graph-structure data more for Mr. Whitehouse. To avoid any misuse of personal data, we hide the names of random Twitter users and only include politicians whose Twitter accounts are publicly available.<ref type="foot" target="#foot_2">3</ref></p><p>The bottom figure of Figure <ref type="figure" target="#fig_3">3</ref> shows the contribution map output when the text encoder is GloVe and the graph encoder is R-GCN, on the same subgroup of the TIMME dataset. Note that for all shown users, text information does not contribute to the final prediction. This could be attributed to the fact that GloVe is not very powerful for sentence embedding, especially when the text is long. 
This contribution map shows that our framework filters out the text modality almost completely when it is not helpful for our user embedding learning. As we can see from Table 2, the traditional fusion method for GloVe+R-GCN only yields an accuracy of 0.840, much lower than the single graph structure modality prediction (0.953) using R-GCN, due to the unreliable GloVe embedding. In contrast, our CAMUE method obtains a higher accuracy (0.954) than all single-modality models, since it disregards unreliable information.</p><p>Figure <ref type="figure" target="#fig_4">4</ref> shows the contribution map for the same set of encoders on a subgroup of the Twibot-20-Sub dataset. There is again a clear cut between the percentages of contributions from the different modalities, for both the human Twitter accounts and the bot accounts.</p><p>Hence, we verify that our framework can both provide personalized modality contributions and drop low-quality information during modality fusion. A quantitative analysis of how this low-quality information filtering benefits overall model performance can be found in the next section, and a qualitative analysis of the new insights we can gain from the output of our framework can be found in the case study section.</p></div>
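The contribution weights discussed above come from a per-user attention gate over the two modality embeddings. The following is a minimal sketch of this kind of gating; the linear gate scores and function names are illustrative assumptions, not the paper's exact architecture. When one modality's gate score is driven very low (as happens with the unreliable GloVe text embedding), its weight approaches 0 and the fused embedding reduces to the other modality.

```python
import math

def softmax2(a: float, b: float):
    """Numerically stable two-way softmax."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    s = ea + eb
    return ea / s, eb / s

def gated_fuse(graph_emb, text_emb, w_graph, w_text):
    """Per-user attention gate: alpha weights the graph embedding,
    beta weights the text embedding, with alpha + beta = 1.

    The linear scoring vectors w_graph / w_text stand in for the
    learned attention-gate module; in training they would be fit
    by gradient descent along with the encoders."""
    g_logit = sum(w * x for w, x in zip(w_graph, graph_emb))
    t_logit = sum(w * x for w, x in zip(w_text, text_emb))
    alpha, beta = softmax2(g_logit, t_logit)
    fused = [alpha * g + beta * t for g, t in zip(graph_emb, text_emb)]
    return fused, alpha, beta
```

Plotting alpha and beta per user is exactly what the contribution maps in Figures 3 and 4 visualize: white for a weight near 0, dark blue for a weight near 1.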
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">General Performance</head><p>Table <ref type="table" target="#tab_3">2</ref> shows the performance of CAMUE on di erent combinations of encoders. The traditional fusion method in Figure <ref type="figure" target="#fig_2">2</ref> is denoted as "simple fusion". For MLP, we do not have such a natural fusion method. We also add "CAMUE, xed params" as an ablation experiment to prove the e ectiveness of our attention gate-based selection module.</p><p>We observe that within those combinations, sometimes simple fusion methods are signi cantly worse than single-modality methods (e.g. GloVe+R-GCN vs R-GCN only) due to some untrustworthiness in one of the modalities. However, any fusion under our CAMUE framework always performs better than the corresponding single modality methods. That suggests that our algorithm can bene t from attending to the more reliable modality between text and graph structure, if one particular modality is not trustworthy (e.g. GloVe embedding), and learning not to consider it when making predictions (as we can see in Figure <ref type="figure" target="#fig_3">3</ref>, bottom).</p><p>It is also notable that our CAMUE method outperforms "CAMUE, xed params". These results suggest that adjusting the weight of di erent modalities dynamically yields better performance than xed weights of modalities. Finally, when the text modality is switched to a more accurate BERT embedding, our framework still gives comparable performance to its corresponding simple fusion methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">Case Studies</head><p>User Sub-groups Table <ref type="table" target="#tab_4">3</ref> gives a quantitative analysis when the text encoder is BERT and the graph encoder is R-GCN, for di erent sub-groups of Twitter users we are interested in. In general, graph   structure information contributes the most when comes to bot accounts. One possible explanation for this is the variety of bot accounts on Twitter, such as those for business advertising, political outreach, and sports marketing <ref type="bibr" target="#b21">[22]</ref>. Bots with di erent usage might talk very di erently, however, they may share some common rule-based policies when interacting with humans on Twitter <ref type="bibr" target="#b52">[52,</ref><ref type="bibr" target="#b53">53]</ref>. Graph structure information contributes the second highest when it comes to politicians. This is also not surprising since politicians are generally more inclined to retweet or mention events related to their political parties <ref type="bibr" target="#b20">[21]</ref>. It is also notable that the weight of text information for Republicans is slightly less than that for Democrats. This aligns with the ndings in <ref type="bibr" target="#b54">[54]</ref> that Democrats have a slightly more politically polarized word choice than Republicans. For random users, the weight of text information is the largest, although still not as large as the weight of graph structure information. This could be attributed to the pattern that many random users interact frequently with their non-celebrity families and friends on Twitter, who are more likely to be politically neutral. Table <ref type="table">4</ref> shows some predicted political stances and the main contributing modalities of a group of news agencies. We can see that most of them have more reliable information about graph structure than text information. 
This is not surprising, since most news agencies tend to use neutral words to increase their credibility; hence it is hard to extract strong political stances from their text embeddings, except for some, like Fox News and the Guardian, which are known to use politically polarized terms more often <ref type="bibr" target="#b54">[54,</ref><ref type="bibr" target="#b55">55]</ref>. Our framework is able to capture this distinctive behavior pattern for Fox News and the Guardian, while giving mostly accurate political polarity predictions that align with the results in <ref type="bibr" target="#b54">[54]</ref> and <ref type="foot" target="#foot_3">4</ref> .</p><p>To conclude, we are able to obtain customized user behavior patterns through our multi-modality fusion. These patterns provide insight into which modality we should focus on more for different types of users, for downstream tasks such as personalized recommendation, social science analysis, or malicious user detection. Since there exists no ground-truth contribution of the two aspects of user profiles (text and graph structure) to their final predictions, we conduct a case study by qualitatively evaluating a subset of users to validate our framework's capability to give personalized explanations. We selected 9 celebrities among the top 50 most-followed Twitter accounts from <ref type="foot" target="#foot_4">5</ref> , whose Twitter accounts appear in the TIMME dataset, as we are not allowed to disclose regular Twitter users' information. We obtain the political polarity predictions of those celebrities and record the percentage of text/graph structure information that contributes to their political polarity predictions (see Figure <ref type="figure" target="#fig_5">5</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Selected Celebrities from TIMME Dataset Since there exists no ground truth contribution</head><p>• Elon Musk: Before 2020 (dataset cut-o ), Elon Musk's political views in his tweet text content are often complex. He claimed multiple times not to take the viewpoints in his tweets too seriously <ref type="foot" target="#foot_5">6</ref> . This aligns with the low contribution weight of his texts on his political stand prediction. However, on the graph level, 66.67% of the politicians Elon Musk liked more than one time, have also liked Trump at least once. This is signi cantly larger than the average number in the TIMME dataset (23.67%). This could be a strong reason why our graph structure weight is so high and why we predict Elon Musk to be Republican-leaning. Our prediction is proved correct when in 2022 (which is beyond our dataset cut-o time, 2020 <ref type="bibr" target="#b54">[54]</ref>), Elon Musk claimed that he would vote for Republicans in his tweet <ref type="foot" target="#foot_6">7</ref> . This is a strong indicator that our framework is using correct information. • LeBron James: In his tweets, LeBron James frequently shows his love and respect to Democratic President Obama<ref type="foot" target="#foot_7">8</ref> . Our prediction for him to be Democrat-leaning with a strong text contribution aligns with this observation. • Lady Gaga: Similarly to James, Lady Gaga also expresses explicitly in her tweets about her support of Democratic candidates <ref type="foot" target="#foot_8">9</ref> . Our graph weight is 0, meaning that the text alone is su cient to predict that she is Democrat-learning. Clinton during the 2016 election, a reason why we predict her as Democrat-leaning from the graph structure. Although she supports some republican politicians in 2022 <ref type="foot" target="#foot_10">11</ref> , that is beyond the dataset cuto . 
• Justin Timberlake: Justin Timberlake has frequent positive interactions with President Obama <ref type="foot" target="#foot_11">12</ref> and firmly supports Hillary Clinton in his tweets <ref type="foot" target="#foot_12">13</ref> , both suggesting that he is Democrat-leaning.</p><p>Our model assigns similar weights to text and graph structure, suggesting that both contribute equally to that prediction. • Taylor Swift: In the case of Taylor Swift, the model fails to give the correct prediction. Her tweets show that she voted for Biden in 2020 <ref type="foot" target="#foot_13">14</ref> , but the prediction is Republican. One reason is that at the graph structure level, the majority of Taylor Swift's followers are classified as Republican (67.09%) in the dataset, which can mislead the graph encoder.</p><p>Overall, we conclude that graph structure information is usually more useful when predicting the political polarities of these celebrities. That aligns with the quantitative results in Table 3. As we can see, different celebrities may have very different behavior patterns. These patterns can be correctly captured and explained by our contribution weights. This confirms the effectiveness of our framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>In this paper, we investigate some potential limitations of existing fusion methods for text information and graph structure information in user representation learning on social networks. We then propose a contribution-aware multimodal social-media user-embedding with a learnable attention module. Our framework can automatically determine the reliability of text and graph-structure information when learning user-embeddings. It lters out unreliable modalities for speci c users across various downstream tasks. Since our framework is not bound to any speci c model, it has great potential to be adapted to any graph-structure-embedding component and text-embedding component, if a ordable. More importantly, our models can give a score on the reliability of di erent information modalities for each user. That gives our framework great capability for personalized downstream analysis and recommendation. Our work can bring research attention to identifying and removing misleading information modality due to di erences in social network user behavior, and paves the way for more explainable, reliable, and e ective social media user representation learning. Some possible future extensions include adding more modalities other than text and graphs (e.g., image and video data from user's posts). Also, we consider the user identities to be static throughout our analysis, which might not be the case in many scenarios. We can bring time as a factor to produce a multi-modality dynamic social media user embedding. For example, a user's text content may be more trustworthy in the rst few months, and interactive graph structure information becomes more reliable in longer terms.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Elon's tweet keywords:-Silicon valley -Hollywood -Tech -Clean energy -…...</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: When predicting which political party Elon Musk voted in 2020, graph structure-based methods and text-based methods might reach opposite conclusions. Top: A small subset of Elon's activity with other Twitter users. Bottom: Some keywords extracted from Elon's tweets. All of which are extracted before the year 2020.</figDesc><graphic coords="2,312.81,325.90,96.19,96.19" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The architectures of our framework. Top: Simple Fusion method for GNN (baseline), bottom: CAMUE</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Contribution map for TIMME dataset, top: CAMUE(BERT, R-GCN), bottom: CAMUE(GloVe, R-GCN), dark blue stands for higher contribution while white stands for lower contribution, ranging from 0.0 to 1.0.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Contribution map for TwiBot-20-Sub dataset, CAMUE(BERT, R-GCN), dark blue stands for higher contribution while white stands for lower contribution, ranging from 0.0 to 1.0.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: A subset of the top 50 followed celebrities. Images licensed under CC-BY or sourced from publicly available Twitter profile images for research purposes under the fair use term from Twitter's Terms of Service. Note that the political polarity prediction comes from our model and may not reflect their actual stances.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Le : A snippet for former U.S. Rep. Ryan Costello, his tweet text content is informative for the political Party prediction, but his Twitter interaction graph data could be misleading. Right: A snippet for U.S. Senator Sheldon Whitehouse, his Twitter interaction graph data is informative for the political Party prediction, but his tweet text content could be misleading.</figDesc><table><row><cell>Name</cell><cell>Ryan Costello</cell><cell>Name</cell><cell>Sheldon Whitehouse</cell></row><row><cell>Ground Truth</cell><cell>Republican</cell><cell>Ground Truth</cell><cell>Democrat</cell></row><row><cell>Party</cell><cell></cell><cell>Party</cell><cell></cell></row><row><cell>Sample Graph</cell><cell>Liked Ben Rhodes (Demo-</cell><cell>Sample Graph</cell><cell>Liked Senate Democrats</cell></row><row><cell>Data</cell><cell>crat) 20 times.</cell><cell>Data</cell><cell>O icial Account 26 times.</cell></row><row><cell></cell><cell>Liked Donald Trump 0</cell><cell></cell><cell>Not following Donald</cell></row><row><cell></cell><cell>time.</cell><cell></cell><cell>Trump.</cell></row><row><cell></cell><cell>Following Mike Quigley</cell><cell></cell><cell>Following Barack Obama.</cell></row><row><cell></cell><cell>(Democrat).</cell><cell>Sample Text</cell><cell>My Republican partner on</cell></row><row><cell>Sample Text</cell><cell>Despite Trump, Iran's elec-</cell><cell>Data</cell><cell>the CARA bill, @SenRob-</cell></row><row><cell>Data</cell><cell>tions &amp; chaotic ME, some</cell><cell></cell><cell>Portman, writes a powerful</cell></row><row><cell></cell><cell>Democrats want to race</cell><cell></cell><cell>editorial on the success of</cell></row><row><cell></cell><cell>ahead with ill-conceived</cell><cell></cell><cell>CARA and CURES (which</cell></row><row><cell></cell><cell>Iran sanctions</cell><cell></cell><cell>provided a needed boost 
of</cell></row><row><cell></cell><cell>RT @SaeedKD: Iran's peo-</cell><cell></cell><cell>funding to match CARA).</cell></row><row><cell></cell><cell>ple care about elections.</cell><cell></cell><cell>Good move by Trump Ad-</cell></row><row><cell></cell><cell>The so-called democratic</cell><cell></cell><cell>ministration. Cong. @Jim-</cell></row><row><cell></cell><cell>fringe doesn't -by me</cell><cell></cell><cell>Langevin &amp;</cell></row><row><cell>Graph-</cell><cell>Democrat (Wrong)</cell><cell>Graph-</cell><cell>Democrat (Right)</cell></row><row><cell>backbone</cell><cell></cell><cell>backbone</cell><cell></cell></row><row><cell>Prediction</cell><cell></cell><cell>Prediction</cell><cell></cell></row><row><cell>Text-backbone</cell><cell>Republican (Right)</cell><cell></cell><cell></cell></row><row><cell>Prediction</cell><cell></cell><cell></cell><cell></cell></row><row><cell>Fused Model</cell><cell>Democrat (Wrong)</cell><cell></cell><cell></cell></row><row><cell>Prediction</cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>2 graph + F 2 text ), so the overall complexity is O(L graph EF graph + L graph NF 2 graph + F 2 text ), hence we are not adding extra time complexity.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 2</head><label>2</label><figDesc>The overall performance (format: accuracy ; f1-score) of CAMUE on di erent social media data sets.</figDesc><table><row><cell>Algorithm</cell><cell cols="2">Encoder Variant Text Graph</cell><cell>TIMME</cell><cell>Data Set TwiBot-20-Sub</cell></row><row><cell>text-only</cell><cell cols="2">GloVe N/A BERT</cell><cell cols="2">0.688 ; 0.681 0.862 ; 0.859</cell><cell>0.565 ; 0.511 0.731 ; 0.722</cell></row><row><cell>link-only</cell><cell>N/A</cell><cell cols="3">MLP R-GCN 0.953 ; 0.953 0.932 ; 0.930</cell><cell>0.707 ; 0.697 0.735 ; 0.728</cell></row><row><cell>simple fusion</cell><cell cols="4">GloVe R-GCN 0.840 ; 0.837 BERT R-GCN 0.959 ; 0.959</cell><cell>0.683 ; 0.675 0.791 ; 0.787</cell></row><row><cell>CAMUE w.</cell><cell>GloVe</cell><cell cols="3">MLP R-GCN 0.952 ; 0.951 0.938 ; 0.937</cell><cell>0.700 ; 0.691 0.734 ; 0.727</cell></row><row><cell>fixed params</cell><cell>BERT</cell><cell cols="3">MLP R-GCN 0.952 ; 0.951 0.940 ; 0.938</cell><cell>0.732 ; 0.722 0.779 ; 0.771</cell></row><row><cell>CAMUE</cell><cell>GloVe BERT</cell><cell cols="3">MLP R-GCN 0.954 ; 0.953 0.945 ; 0.944 MLP 0.935 ; 0.933 R-GCN 0.961 ; 0.960</cell><cell>0.707 ; 0.697 0.738 ; 0.731 0.744 ; 0.738 0.782 ; 0.776</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3 %</head><label>3</label><figDesc>Users whose ↵ &gt; for di erent subgroups</figDesc><table><row><cell cols="2">Subgroup</cell><cell>% Users</cell></row><row><cell cols="2">Democrats</cell><cell>70.9</cell></row><row><cell cols="2">Republicans</cell><cell>76.1</cell></row><row><cell cols="2">Politicians</cell><cell>76.2</cell></row><row><cell cols="2">Non-politicians with Party a iliations</cell><cell>72.4</cell></row><row><cell cols="2">Non-bot random users</cell><cell>61.2</cell></row><row><cell cols="2">Bot accounts</cell><cell>77.3</cell></row><row><cell cols="2">TIMME, aggregated</cell><cell>73.5</cell></row><row><cell cols="2">TwiBot-20-Sub, aggregated</cell><cell>70.1</cell></row><row><cell>Table 4</cell><cell></cell><cell></cell></row><row><cell>Predicted Political Stance of Some News Agencies</cell><cell></cell><cell></cell></row><row><cell>News Agency</cell><cell cols="2">Prediction Text or Graph</cell></row><row><cell>New York Times</cell><cell>D</cell><cell>Graph</cell></row><row><cell>Washington Post</cell><cell>D</cell><cell>Graph</cell></row><row><cell>Wall Street Journal</cell><cell>R</cell><cell>Text</cell></row><row><cell>USA Today</cell><cell>D</cell><cell>Graph</cell></row><row><cell>CNN</cell><cell>D</cell><cell>Graph</cell></row><row><cell>Fox News</cell><cell>R</cell><cell>Text</cell></row><row><cell>Guardian</cell><cell>D</cell><cell>Text</cell></row><row><cell>Associated Press</cell><cell>R</cell><cell>Graph</cell></row><row><cell>US News</cell><cell>D</cell><cell>Graph</cell></row><row><cell>MSNBC</cell><cell>D</cell><cell>Graph</cell></row><row><cell>BBC</cell><cell>R</cell><cell>Graph</cell></row><row><cell>National Review</cell><cell>R</cell><cell>Graph</cell></row><row><cell>Bloomberg</cell><cell>D</cell><cell>graph</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head></head><label></label><figDesc>• Bill Gates: He usually avoids making explicit statements about whether he supports Democrats or Republicans in his tweets. Although our model predicts him as the Republican, the probability edge is very marginal (11%).• Oprah Winfrey: During the 2016 presidential campaign, she retweeted and mentioned her support for Democratic candidate Hillary Clinton frequently10 , making the graph structure information a strong indicator of her Democratic stance. • Jimmy Fallon: Jimmy Fallon has managed to maintain a sense of political neutrality in his tweets. His text contribution to the nal prediction is 0. Even though the Twitter graph structure indicates that he is Democrat-leaning, we still do not know in real life whether he is a Democrat or Republican. • Katy Perry: Just like Oprah Winfrey, Katy Perry also interacted with and supported Hillary</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://socialblade.com/twitter/top/100</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">glove.6B.zip from https://nlp.stanford.edu/projects/glove/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://tweeterid.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.allsides.com/media-bias/ratings</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://socialblade.com/twitter/top/100</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://twitter.com/elonmusk/status/1007780580396683267</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://twitter.com/elonmusk/status/1526997132858822658</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">https://twitter.com/KingJames/status/1290774046964101123, https://twitter.com/KingJames/status/1531837452591042561</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">https://twitter.com/ladygaga/status/1325120729130528768</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_9">https://twitter.com/Oprah/status/780588770726993920</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_10">https://twitter.com/katyperry/status/1533246681910628352</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_11">https://twitter.com/jtimberlake/status/1025867320407846912</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_12">https://twitter.com/jtimberlake/status/768191007036891136</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_13">https://twitter.com/taylorswift13/status/1266392274549776387</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Connecting social media to e-commerce: Cold-start product recommendation using microblogging information</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">Y</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>-R. Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1109/TKDE.2015.2508816</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="1147" to="1159" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Graph neural networks for social recommendation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yin</surname></persName>
		</author>
		<idno type="DOI">10.1145/3308558.3313488</idno>
		<idno>doi:10.1145/3308558.3313488</idno>
		<ptr target="https://doi.org/10.1145/3308558.3313488" />
	</analytic>
	<monogr>
		<title level="m">The World Wide Web Conference</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="417" to="426" />
		</imprint>
	</monogr>
	<note>WWW &apos;19</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Beyond binary labels: Political ideology prediction of Twitter users</title>
		<author>
			<persName><forename type="first">D</forename><surname>Preoţiuc-Pietro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hopkins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ungar</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P17-1068</idno>
		<ptr target="https://aclanthology.org/P17-1068.doi:10.18653/v1/P17-1068" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 55th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Vancouver, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="729" to="740" />
		</imprint>
	</monogr>
	<note>: Long Papers), Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Twitter user representation using weakly supervised graph embedding</title>
		<author>
			<persName><forename type="first">T</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Goldwasser</surname></persName>
		</author>
		<idno type="DOI">10.1609/icwsm.v16i1.19298</idno>
		<ptr target="https://ojs.aaai.org/index.php/ICWSM/article/view/19298.doi:10.1609/icwsm.v16i1.19298" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="358" to="369" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Retweet-bert: Political leaning detection using language features and information di usion on social networks</title>
		<author>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ferrara</surname></persName>
		</author>
		<idno type="DOI">10.1609/icwsm.v17i1.22160</idno>
		<ptr target="https://ojs.aaai.org/index.php/ICWSM/article/view/22160.doi:10.1609/icwsm.v17i1.22160" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="459" to="469" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Online human-bot interactions: Detection, estimation, and characterization</title>
		<author>
			<persName><forename type="first">O</forename><surname>Varol</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ferrara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Davis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Menczer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Flammini</surname></persName>
		</author>
		<idno type="DOI">10.1609/icwsm.v11i1.14871</idno>
		<ptr target="https://ojs.aaai.org/index.php/ICWSM/article/view/14871.doi:10.1609/icwsm.v11i1.14871" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="280" to="289" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Deep neural networks for bot detection</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kudugunta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ferrara</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ins.2018.08.019</idno>
		<ptr target="https://doi.org/10.1016/j.ins.2018.08.019" />
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">467</biblScope>
			<biblScope unit="page" from="312" to="322" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Botbuster: Multi-platform bot detection using a mixture of experts</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H X</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Carley</surname></persName>
		</author>
		<idno type="DOI">10.1609/icwsm.v17i1.22179</idno>
		<ptr target="https://ojs.aaai.org/index.php/ICWSM/article/view/22179.doi:10.1609/icwsm.v17i1.22179" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="686" to="697" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Heterogeneous graph network embedding for sentiment analysis on social media</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cognitive Computation</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="81" to="95" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Social bots detection via fusing bert and graph convolutional networks</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Symmetry</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">30</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Socialgcn: An efficient graph convolutional network based model for social recommendation</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<idno>CoRR abs/1811.02815</idno>
		<ptr target="http://arxiv.org/abs/1811.02815" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Network embedding by fusing multimodal contents and links</title>
		<author>
			<persName><forename type="first">F</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">171</biblScope>
			<biblScope unit="page" from="44" to="55" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Semi-supervised classification with graph convolutional networks</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">N</forename><surname>Kipf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=SJU4ayYgl" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Inductive representation learning on large graphs</title>
		<author>
			<persName><forename type="first">W</forename><surname>Hamilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ying</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Graph attention networks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Velickovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Cucurull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Casanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Glove: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</title>
				<meeting>the 2014 conference on empirical methods in natural language processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/N19-1423</idno>
		<ptr target="https://aclanthology.org/N19-1423" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<editor>
			<persName><forename type="first">J</forename><surname>Burstein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Doran</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Solorio</surname></persName>
		</editor>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Graph neural networks: A review of methods and applications</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AI open</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="57" to="81" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Vocal minority versus silent majority: Discovering the opinions of the long tail</title>
		<author>
			<persName><forename type="first">E</forename><surname>Mustafaraj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Finn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Whitlock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">T</forename><surname>Metaxas</surname></persName>
		</author>
		<idno type="DOI">10.1109/PASSAT/SocialCom.2011.188</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing</title>
				<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="103" to="110" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">The digital divide among twitter users and its implications for social research</title>
		<author>
			<persName><forename type="first">G</forename><surname>Blank</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Social Science Computer Review</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="679" to="697" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Timme: Twitter ideology-detection via multi-task multirelational embedding</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1145/3394486.3403275</idno>
		<ptr target="https://doi.org/10.1145/3394486.3403275" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD &apos;20</title>
				<meeting>the 26th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD &apos;20<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2258" to="2268" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Twibot-20: A comprehensive twitter bot detection benchmark</title>
		<author>
			<persName><forename type="first">S</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Luo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 30th ACM International Conference on Information &amp; Knowledge Management</title>
				<meeting>the 30th ACM International Conference on Information &amp; Knowledge Management</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Characterizing and detecting hateful users on twitter</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ribeiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Calais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Meira</surname><genName>Jr</genName></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">12</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Anrl: attributed network representation learning via deep neural networks</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Ijcai</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="3155" to="3161" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Attributed social network embedding</title>
		<author>
			<persName><forename type="first">L</forename><surname>Liao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-S</forename><surname>Chua</surname></persName>
		</author>
		<idno type="DOI">10.1109/TKDE.2018.2819980</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="2257" to="2270" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">User-guided hierarchical attention network for multi-modal social image popularity prediction</title>
		<author>
			<persName><forename type="first">W</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 world wide web conference</title>
				<meeting>the 2018 world wide web conference</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1277" to="1286" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">A two-stage embedding model for recommendation with multimodal auxiliary information</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">582</biblScope>
			<biblScope unit="page" from="22" to="37" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Social media-based user embedding: A literature review</title>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ding</surname></persName>
		</author>
		<idno type="DOI">10.24963/ijcai.2019/881</idno>
		<ptr target="https://doi.org/10.24963/ijcai.2019/881" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint Conferences on Artificial Intelligence Organization</title>
				<meeting>the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint Conferences on Artificial Intelligence Organization</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="6318" to="6324" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">Twitter user geolocation using deep multiview learning</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">H</forename><surname>Do</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tsiligianni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Cornelis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Deligiannis</surname></persName>
		</author>
		<idno>CoRR abs/1805.04612</idno>
		<ptr target="http://arxiv.org/abs/1805.04612" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Volunteerism tendency prediction via harvesting multiple social networks</title>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-Y</forename><surname>Ming</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Nie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-L</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-S</forename><surname>Chua</surname></persName>
		</author>
		<idno type="DOI">10.1145/2832907</idno>
		<ptr target="https://doi.org/10.1145/2832907" />
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Learning multiview embeddings of twitter users</title>
		<author>
			<persName><forename type="first">A</forename><surname>Benton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Arora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dredze</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Short Papers</title>
		<meeting>the 54th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="14" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Graph convolutional networks for text classification</title>
		<author>
			<persName><forename type="first">L</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Luo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="7370" to="7377" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<title level="m" type="main">User preference-aware fake news detection</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Dou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sun</surname></persName>
		</author>
		<idno>CoRR abs/2104.12259</idno>
		<ptr target="https://arxiv.org/abs/2104.12259" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Modeling relational data with graph convolutional networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Schlichtkrull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">N</forename><surname>Kipf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bloem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van Den Berg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Titov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The semantic web: 15th international conference, ESWC 2018</title>
				<meeting><address><addrLine>Heraklion, Crete, Greece</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">June 3-7, 2018</date>
			<biblScope unit="page" from="593" to="607" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Distributed representations of words and phrases and their compositionality</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper_files/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Burges</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Ghahramani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Weinberger</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">26</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Deep contextualized word representations</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Neumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gardner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/N18-1202</idno>
		<ptr target="https://aclanthology.org/N18-1202" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Walker</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Stent</surname></persName>
		</editor>
		<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>New Orleans, Louisiana</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2227" to="2237" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ł</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ryder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Subbiah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neelakantan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shyam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Herbert-Voss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Krueger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ziegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Winter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hesse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sigler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Litwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Berner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Larochelle</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Ranzato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Hadsell</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Balcan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="1877" to="1901" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Training language models to follow instructions with human feedback</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ouyang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Wainwright</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mishkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Slama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schulman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Kelton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Simens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Welinder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Christiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leike</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lowe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS &apos;22</title>
				<meeting>the 36th International Conference on Neural Information Processing Systems, NIPS &apos;22<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">A review on explainability in multimodal deep neural nets</title>
		<author>
			<persName><forename type="first">G</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Walambe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kotecha</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2021.3070212</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="59800" to="59821" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<monogr>
		<title level="m" type="main">Point and ask: Incorporating pointing into visual question answering</title>
		<author>
			<persName><forename type="first">A</forename><surname>Mani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hinthorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Yoo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Russakovsky</surname></persName>
		</author>
		<idno>CoRR abs/2011.13681</idno>
		<ptr target="https://arxiv.org/abs/2011.13681" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<monogr>
		<title level="m" type="main">Interpreting visual question answering models</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohapatra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Parikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Batra</surname></persName>
		</author>
		<idno>CoRR abs/1608.08974</idno>
		<ptr target="http://arxiv.org/abs/1608.08974" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<analytic>
		<title level="a" type="main">Generation of multimodal justification using visual word constraint model for explainable computer-aided diagnosis</title>
		<author>
			<persName><forename type="first">H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">M</forename><surname>Ro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Interpretability of Machine Intelligence in Medical Image Computing and Multimodal Learning for Clinical Decision Support</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Suzuki</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Reyes</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Syeda-Mahmood</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Konukoglu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Glocker</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Wiest</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Gur</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Greenspan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Madabhushi</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="21" to="29" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<analytic>
		<title level="a" type="main">Algorithmic recourse: from counterfactual explanations to interventions</title>
		<author>
			<persName><forename type="first">A.-H</forename><surname>Karimi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Schölkopf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Valera</surname></persName>
		</author>
		<idno type="DOI">10.1145/3442188.3445899</idno>
		<ptr target="https://doi.org/10.1145/3442188.3445899" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="353" to="362" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b44">
	<monogr>
		<title level="m" type="main">Generating counterfactual explanations with natural language</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Hendricks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Akata</surname></persName>
		</author>
		<idno>CoRR abs/1806.09809</idno>
		<ptr target="http://arxiv.org/abs/1806.09809" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b45">
	<analytic>
		<title level="a" type="main">The impact of explanations on AI competency prediction in VQA</title>
		<author>
			<persName><forename type="first">K</forename><surname>Alipour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Schulze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">T</forename><surname>Burachas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Humanized Computing and Communication with Artificial Intelligence (HCCAI), IEEE</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="25" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b46">
	<analytic>
		<title level="a" type="main">Semantics of the black-box: Can knowledge graphs help make deep learning systems more interpretable and explainable?</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gaur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Faldu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sheth</surname></persName>
		</author>
		<idno type="DOI">10.1109/MIC.2020.3031769</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Internet Computing</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="51" to="59" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<monogr>
		<title level="m" type="main">Time and space complexity of graph convolutional networks</title>
		<author>
			<persName><forename type="first">D</forename><surname>Blakely</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lanchantin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021-12-31">Dec 31, 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b48">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Brevdo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Chollet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gouws</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b49">
	<analytic>
		<title level="a" type="main">Tensor2tensor for neural machine translation</title>
		<author>
			<persName><forename type="first">Ł</forename><surname>Kaiser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th Conference of the Association for Machine Translation in the Americas</title>
				<meeting>the 13th Conference of the Association for Machine Translation in the Americas</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="193" to="199" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b50">
	<monogr>
		<title level="m" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Toutanova</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/1810.04805" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b51">
	<monogr>
		<title level="m" type="main">The flan collection: Designing data and methods for effective instruction tuning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Longpre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Webson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">W</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zoph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2301.13688</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b52">
	<analytic>
		<title level="a" type="main">Detecting social bots on twitter: a literature review</title>
		<author>
			<persName><forename type="first">E</forename><surname>Alothali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Alashwal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2018 International conference on innovations in information technology (IIT), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="175" to="180" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b53">
	<analytic>
		<title level="a" type="main">Rtbust: Exploiting temporal patterns for botnet detection on twitter</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mazza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cresci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Avvenuti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Quattrociocchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tesconi</surname></persName>
		</author>
		<idno type="DOI">10.1145/3292522.3326015</idno>
		<ptr target="https://doi.org/10.1145/3292522.3326015" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th ACM Conference on Web Science, WebSci &apos;19</title>
				<meeting>the 10th ACM Conference on Web Science, WebSci &apos;19<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="183" to="192" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b54">
	<analytic>
		<title level="a" type="main">Detecting political biases of named entities and hashtags on twitter</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">H</forename><surname>Lam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Porter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EPJ Data Science</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">20</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b55">
	<analytic>
		<title level="a" type="main">Populism, the media, and the mainstreaming of the far right: The guardian coverage of populism as a case study</title>
		<author>
			<persName><forename type="first">K</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mondon</surname></persName>
		</author>
		<idno type="DOI">10.1177/0263395720955036</idno>
		<ptr target="https://doi.org/10.1177/0263395720955036" />
	</analytic>
	<monogr>
		<title level="j">Politics</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="279" to="295" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
