Introduction

Behavioral Tracing of Twitter Accounts

Neel Guha

0 0 Stanford University

\Trolls" - individuals who engage in malicious behavior are a common occurrence within online communities. Yet simply banning accounts associated with trolls is often ine ective as individuals may register new accounts under pseudonyms and resume their activity. In this paper, we demonstrate how this can be addressed through a behavioral trace. Speci cally, we show that by analyzing the posts of an account, we can derive a semantic signature unique to the account's owner. By comparing the signatures of two accounts, we can determine whether they belong to the same user. We validate our techniques on a dataset of Twitter users, and explore di erent properties of our methods.

Introduction

In recent years, online communities have increasingly struggled with the emergence of malicious accounts. These accounts engage in adversarial behavior, often spreading harmful content or attacking other individuals on the platform. This especially prominent on Twitter, where most interactions are public and accounts are not required to correspond to real life identities (unlike Facebook).

Eliminating these malicious accounts is di cult for many reasons. Firstly, the process of banning accounts is resource intensive and arduous, occurring rarely and often too late. Platforms like Twitter often rely on some human validation before banning accounts, resulting in a backlog of agged accounts. Additionally, once an account is banned, it is trivial for the individual to create a new account under a pseudonym. They can resume their malicious behavior through this new account, thus creating a perpetual cycle.

Though the process of banning accounts will likely remain arduous and time consuming due to legal/corporate policies and procedures, it should be possible to prevent banned individuals from creating new accounts, or at least detect when an account may have been created by an individual previously banned.

In theory, this could be achieved through phone veri cation, or IP address blacklisting. However, these can have unintended consequences. Asking for users to validate their accounts with phone numbers may expose individuals who live in oppressive countries and have a legitimate need for privacy. Such measures hamper their ability to use mechanisms like Tor to access Twitter[ 5 ]. Banning accounts on the basis of IP address is also ine ective, as an individual could merely switch to a di erent network to create/access their account. If an individual uses a public machine (in a cybercafe or library), banning on IP address may prevent large numbers of other individuals from accessing their accounts on the same machine.

In this paper, we present behavioral tracing : a method by which accounts created by the same individual can be identi ed and linked on the basis of the content of the accounts. Intuitively, an account's posts represent the topical interests and idiosyncrasies of its owner. Thus, in examining an account's posts, we should be able to derive a signature unique to the account's owner. We refer to this as a trace, and demonstrate it can be constructed. By comparing the traces of two di erent accounts, we can predict whether or not they were owned by the same individual. Applying this in the context presented above, we can use a trace to examine newly created accounts and determine if they resemble a banned account.

Our work is novel in our focus on semantic signatures. Unlike prior work, we formulate an authorship model based primarily on the content produced by a user (as opposed to the user's relations in the network graph, or lexical clues). Rather than constructing user-speci c classi ers to identify accounts belonging to the same user, we introduce a single method applicable to all users. Specifically, we derive a vector space representation for each account (based on the account's post) where the distance between two accounts is indicative of the likelihood that they originate from the same user. 2

Related Work

There is a wealth of literature on di erent techniques for establishing authorship [ 9 ]. [ 18 ] presents methods for authorship identi cation of postings in online communities. The authors experimented with a variety of features (lexical, structural, content based, etc.) and models (neural networks, support vector machines, etc) to establish authorship of di erent posts. Though this is similar to our work, there are several key di erences. The posts analyzed in [ 18 ] were on average over 150 words - well over the 140 character limit of Twitter. Additionally, the goal of the work was to establish the authorship of single posts, and not a collection of posts (corresponding to a single account). Though there has been work on establishing authorship in a Twitter context, it has primarily focused on using lexical and syntactic features [ 11 ][ 2 ]. In our paper, we demonstrate how authorship can also be determine by using extracting a semantic (topic based) signature for every user. To the best of our knowledge, this is a novel approach.

It is important to distinguish prior literature on spam and troll detection from our work. We focus on \linking" Twitter accounts to establish when two accounts were launched by the same user. Though a primary application of this work may be in detecting trolls, it could also be used to detect when a single individual is seeking to in uence a discussion through the creation of multiple accounts. Much of the prior work on spamming and trolling focuses on leveraging network or language characteristics to identify common traits of banned accounts.

There has been signi cant prior work on the role of spammers within social networks like Twitter. Many, like [ 17 ], have focused on characterizing the nature of spamming Twitter accounts. These works have demonstrated the techniques spammers use to promote content, and various approaches that could be used to detect them. It is important to clarify the distinction between spam detection, and the focus of our paper. Spammers primarily use platforms like Twitter to propagate commercial content, and convince users to take certain actions (clicking a link, downloading some software, purchasing a product). Spam accounts tend to be \fake" accounts that aren't tied to any real individual, and are often controlled by bots. In contrast, we focus on \real" accounts that are controlled by real individuals, and represent their interests. These individuals are thus signi cantly less likely to follow the follow the behavioral patterns of fake spam accounts. Our work is partially inspired by our prior work in [ 7 ], which present several methods for identifying web users across di erent browser sessions. Though we incorporate some prior techniques, both our approaches and the nature of the problem are very di erent.

Prior work has also focused on identifying \trolls" or adversaries within social networks [ 15 ] [ 10 ].[ 4 ] presents techniques for detecting trolls within social media networks. However, they assume that trolling individuals create fake troll accounts in addition to their real account. Further, the fake account is followed by the real account, and regularly interacts with the real account. On a limited sample of accounts, they present techniques for identifying the authorship of individual tweets. Our work doesn't make these assumptions. [ 3 ] analyzes \antisocial users" to determine characteristics of banned users. However, their work focuses less on identifying speci er users, and more on analyzing the behavior of banned users on numerous internet forums.

Similarly, there has been signi cant work on de-anonymizing social network users by utilizing information about network relationships [ 8 ] [ 6 ] [ 19 ]. In particular, [ 14 ] demonstrates how anonymized users with accounts on both Flickr and Twitter can be identi ed using graph topology. [ 16 ] attempts to identify Twitter accounts on the basis of browsing histories. By analyzing the t.co URLs visited by a user, they can determine the combination of accounts the user must have been following, which in turn can be used to identify the user's account. However, this approach fails to derive a ngerprint based on the user's interests - a critical contribution of our work. Furthermore, they require the browsing history of a user, a data source that is not often available. 3

Behavioral Trace

In this paper, we introduce behavioral tracing, a method to identify when posts from two Twitter accounts were authored by the same individual. There are many cases where this technique may be applicable. For example, we could apply it to determine when a previously banned user has returned to a platform (i.e. Twitter) and continued their activity under a pseudonym. Alternatively, a user may decide to operate two accounts within a particular community (to reinforce their opinions or create a perception of popularity). Behavioral tracing would allow us to identify such cases.

Our intuition is that a user's tweets are drawn from a xed distribution governed by the user's interests. Given enough tweets from a single user, we can derive some approximation of the original distribution (a user's behavioral trace - also referred to as a user's trace). By comparing the extracted approximations from two accounts, we can determine when two accounts are in fact run by the same user. Thus, if we compared the trace from a banned account to the trace of an active account, we can determine when a user reenters the platform under a pseudonym. In this section we formalize the notion of a behavioral trace, describe how it can be used, and where limitations exist.

This approach also assumes that users maintain a consistent interest distribution and that a signi cant fraction of tweets posted by a user (regardless of the account used) are drawn from this distribution. If this condition were violated - for example, if a user had di erent interest distributions for di erent accounts - then it would be signi cantly harder to extract a meaningful trace. Thus, we assume that when a user has been banned from the platform and returns under a pseudonym, their tweets continue to be drawn from their original distribution. In other words, a user's interests are maintained between both accounts, and their behavior does not signi cantly alter.

There are however, several important limitations to acknowledge. Over time, a user's interests are likely to change. Hence we can expect that in the longer term, a user's interest distribution will gradually shift, making it harder to identify a user. This is something we hope to explore in future work. Additionally, if an individual creates two accounts but uses them for signi cantly di erent aims (professional and personal), the traces extracted won't be similar enough.

We now formalize the notion of a behavioral trace. We imagine a user u having a set of topical interests characterized by a distribution B over all possible interests/topics. Furthermore, we assume that every tweet (ti) authored by u is sampled at random from B. Thus, we should expect that as a user posts more tweets, their collection of tweets grows more representative of their interests (the distribution of ti's should resemble B).

Underlying our approach is the assumption that with high probability, any two users u1 and u2 will have di erent interest distributions (B1 and B2). We reason that individuals tend to be quite diverse in their interests. Though most users undoubtedly share common interests (sports teams, hobbies, etc.), the ways in which individuals process or share information tend to be highly personalized. When examined at a highly granular level, most individuals are distinguishable from one another. Thus, in our approach we seek to construct a trace for each user - an approximation of that user's interesting distribution inferred from their tweets. We treat the trace as a signature, and use it to ngerprint users.

We frame our problem as follows. Given two sets of tweets (T1 and T2) from two di erent accounts, our goal is to extract a trace (referred to as b^1 and b^2) that approximates the interest distribution of each account. If the traces are su ciently similar, then we can determine that they must correspond to the same interest distribution, and that the same user is responsible for writing both sets of tweets. However, if they are su ciently di erent, then we can determine that refer to di erent interest distributions, and that both sets of tweets were written by di erent users. 4

Methodology

We formulate the task as follows. Given a set of tweets from n users, we partition each user's tweets into 2 separate accounts (giving a total of 2n accounts). Our goal is to re-identify the accounts by determining which originate from the same user. 4.1

Approach

We attempt to map each account to a vector based on its interests/behavior/topics. Importantly, we seek to do so in a manner such that accounts corresponding to the same user are close to each other in this vector space. Prior work has demonstrated how word embeddings (e.g. Word2Vec) can capture rich semantic meaning in a way that traditional bag-of-words models cannot [ 13 ]. By constructing models to predict a word from its context (or vice versa), these models allow us to map words/phrases to vectors. Most notably, words that are \close" to each other in the vector space are likely to share similar contexts (and thus meaning).

In this work, we draw on Doc2Vec[ 12 ], an extension of the Word2Vec model that allows us to construct representations of variable length (i.e. documents). Our approach is motivated by the intuition that we can e ectively construct a trace for each user by relying on word embeddings. In doing so, we can derive a vector for each account where the distance between accounts re ects the likelihood that they originate from the same author.

In this work, we collate all tweets from an account and treat the account like a single \document". We run Doc2Vec on the collection of accounts to derive a vector representation for each account [ 1 ]. Rather than compute a similarity score between every pair of accounts, we run k-means clustering to sort the accounts into di erent clusters (on the basis of their inferred vectors). In doing so, we're able to learn the \neighborhood" of an account - other accounts that look similar and are thus more likely to originate from the same user. Relying on this intuition, we thus only calculate a pairwise similarity score for accounts within the same cluster. We assume that accounts in di erent clusters correspond to di erent users. We nd that this is a relatively safe assumption which allows us to signi cantly reduce the run time.

After deriving a location for each account in the vector space, we seek to identify the accounts in its neighborhood that could originate from the same user. For two accounts represented as the vectors ai and aj , we calculate Score(ai; aj ) in the following manner.

Score(ai; aj ) =

Cosine(ai; aj ) We describe this as a \weighted similarity" function, which weighs the similarity of two accounts by how dissimilar they are. It is not su cient to say that two accounts are similar. Rather, we can only be con dent that two accounts correspond to the same user if they are both similar to each other and dissimilar to other accounts. If we have two accounts ai and aj such that ai is similar to aj but both ai and aj are similar to the bulk of the accounts in our data set, we are less con dent that ai and aj originate from the same user. it is probable that ai and aj (and the accounts they are similar to) belong to a mass of users whose behavior is too shallow or generic to discern. Conversely, if ai and aj were similar to each other but di erent from other accounts, we would be signi cantly more con dent that both accounts originated from the same user. Hence, our scoring function is weighted by both account similarity and account dissimilarity.

For calculating the similarity between two accounts ai and aj , we use the cosine similarity metric, a common measure in information retrieval. For two n-dimensional vectors, the cosine similarity is calculated by

Cosine(si; sj ) =

si sj jjsijjjjsj jj

For each account, we deem the account with the highest score that exceeds the threshold to be from the same author. If no accounts have a score above the threshold, then the account in question is deemed not to share an author with any other account in the dataset. If multiple other accounts have scores which exceed the threshold, we only pick the account with the highest score. As we discuss in the next section, this approach is highly exible, allowing us to achieve di erent types of results by varying the cuto score used. 4.2

Evaluation

We measure the success of our approach using the precision-recall framework. Precision is de ned as the proportion of account pairings we identify that are correct.

Precision = jSt \ Spj jSpj where Sp is the set of account pairings we predict and St is the set containing all pairs of accounts that originate from the same user (truth). Recall is de ned as the proportion of same user account pairs that are identi ed by our methodology, or

Recall = jSt \ Spj jStj In the context of our application, precision is the proportion of identi ed account pairings that do correspond to the same user. Recall is the proportion of sameuser account pairings that we do identify.

Using the precision-recall framework to evaluate our approach allows us to modulate the type of result achieved based. Depending on the context in which we're applying the methodology, this can di er. Sometimes perhaps, we may require a strategy that delivers high precision. This would be preferable, for example, if we chose to be conservative in our identi cation of accounts. Alternatively, we may want to ag as many accounts as possible. In this case, we would prefer a strategy which delivered a high recall (even at the cost of precision). 4.3

Baseline

To establish a baseline, we simulate an adversary randomly guessing accounts as pairs. We do this by randomly generating a score between [0; 1] for each pair of accounts. We pick the cuto that maximizes the F1 score and report results at that threshold.

In addition, we o er a more advanced baseline by running K-Means clustering directly on the generated Doc2Vec vectors for each account. Speci cally, we set the number of desired clusters equal to the number of users. If two accounts are contained in the same cluster, we predict those two accounts to originate form the same user. 5

Data

Using the Twitter API, we collected 1,270,999 tweets from 1849 users. Of these, 678,403 were retweets and 592,596 were original tweets by users. Figure 1 shows a cumulative histogram of the number of tweets for every account in the dataset. The vast majority have fewer than 2000 tweets. Figure 2 is a histogram of the proportion of retweets for all accounts (the fraction of an account's tweets that are retweeted). The majority of accounts in our dataset are regularly active, with half posting at least 1.84 times per day.

Given that the focus of this work was on using semantic clues to develop unique identi ers for di erent Twitter accounts, we took care to clean tweets so that the algorithm would not identify accounts on the basis of their network properties. Speci cally, we removed all account handles from the text from every tweet (e.g, \@exampleAccount").

We evaluated our algorithm as follows. We split our dataset of users into two groups - a \training" set and a \testing" set. Within each set, we split each user into two separate accounts (with each account containing half of the user's tweets). We applied our algorithm on the training set, and identi ed the cuto distance at which the F1 score was maximized. We then applied our algorithm to the test set, using the cuto score derived from the training set to predict which accounts belonged to the same user. We repeated this process 25 di erent times (sampling a di erent training and testing set each time).

This particular procedure allows us to justify the nal cuto used to identify accounts. We can imagine that in di erent contexts, a di erent cuto s might be necessary. \Learning" it in this manner will allows us to better approximate an optimal cuto .

Additionally, we experimented with the e ect of retweets on our approach's performance. We thus ran two variations of our strategy. In the rst, we ignored all retweets by accounts, using only \original" tweets to construct account traces. In the second variation, we used all of an account's tweets (including retweets) to construct the trace. Table 5 presents the results of these two versions, along with the baseline performance.

We experimented by varying the number of tweets each account was generated from. We see that as the number of tweets per account increases, the algorithms performance improves (Figure 4). However, we observe that the overall performance of the algorithm appears to level o after roughly 200 tweets.

We also experimented by varying the number of users targeted by our approach. We nd that generally, as the number of users analyzed increases, the algorithm's ability to extract a uniquely identi able ngerprint decreases. The results in Table 5 demonstrate the e ectiveness of our approach. Our algorithm exceeds both the randomized and naive clustering baseline, suggesting that our methods are capable of both successfully constructing unique traces, and using these trace to identify when tweets from two accounts are authored by the same user.

Our techniques demonstrate signi cantly improved results when we use an account's reweets to derive a semantic signature. There are several ways to interpret this result. Its possible that by using an account's retweets, our extracted semantic signature is in uenced by the user's location in the Twitter network. Users are more likely to retweet accounts that they are following/followed by. Thus, when the majority of a user's tweets are retweets, the extracted semantic signature is e ectively a re ection of the network structure surrounding the user.

However, a user's retweets are likely to re ect their interest pro le. Furthermore, every account they retweet is also likely to have an extractable semantic signature (given enough tweets). Thus, we can view the extracted semantic signature for user not solely as their own, but as a composition of the semantic signatures of the accounts they frequently retweet.

We also nd that performance in general improves as the number of tweets sampled for each account increases. Intuitively, this follows. As we gather more tweets from an account, we're able to better approximate the user's pro le, and thus build a better trace. After a while however, there appear to be diminishing returns.

Additionally, we nd that as the number of accounts we run our algorithm on increases, performance tends to decrease. As we grow our sample, we can imagine that accounts grow less distinguishable, and tend towards a more general, \average" interest pro le. In these cases, it becomes hard for us to extract a unique trace for each account. However, the results in Figure 5 suggest that our strategy still nds success for larger samples of users. It's likely that our approach is conducting a variant of \outlier detection", in e ect identifying users who are su ciently di erent from all others.

Additionally, we nd that that when the sample of users is small, the cuto learned on the training set results in poorer performance on the testing set (farther away from the optimal point). When the cuto is large however, we nd that the performance on the training set is comparable to the test set.

The methods we present can be extended beyond the problem posed in this paper. The ability to construct ngerprints for users on the basis of their behavior has wide ranging implications for privacy and security. Broadly speaking, behavioral tracing is applicable in any domain where individuals take actions consistent with a set of interests, habits, or tasks. It could be used for example, to identify someone on the basis of online purchases. Equivalently, it could also be used to disambiguate between multiple individuals using a single account (e.g. on Net ix). At its core, behavioral tracing o ers a way of uniquely identifying individuals on the basis of their behavior. Because behavior is hard to mask or alter, behavioral tracing is especially potent.

In summary, the primary contribution of our work is behavioral tracing, a topical authorship model for Twitter. In framing a user's tweets as samples from their interest distribution, we demonstrate how users can be ngerprinted on the basis of a semantic signature. Validating our approach on real world Twitter data, we demonstrate how it can nd success at identify users across di erent accounts. 7

Acknowledgments

We'd like to thank Anand Shukla, Ramakrishnan Srikant, Dan Boneh, Ramanathan Guha, Mehran Sahami, Lea Kissner, Scott Ellis, and Jonathan Mayer for their advice and guidance on this project.

1. Deep learning with paragraph2vec . https://radimrehurek.com/gensim/models/doc2vec.html.

2. Bhargava , M. , Mehndiratta , P. , and Asawa , K. Stylometric Analysis for Authorship Attribution on Twitter . Springer International Publishing, Cham, 2013 , pp. 37 { 47 .

3. Cheng, J., Danescu-Niculescu-Mizil , C. , and Leskovec , J. Antisocial behavior in online discussion communities . CoRR abs/1504 .00680 ( 2015 ).

4. Galan-Garc

, P., de la Puerta , J. G. , Gomez , C. L. , Santos , I. , and Bringas , P. G. Supervised Machine Learning for the Detection of Troll Pro les in Twitter Social Network: Application to a Real Case of Cyberbullying . Springer International Publishing, Cham, 2014 , pp. 419 { 428 .

5. Gibbs , S. The Problem With Twitters New Abuse Strategy . https://www.theguardian.com/technology/2015/mar/04/twitters-new -bid-toend-online-abuse-could-endanger-dissidents-analysis.

6. Gross , R. , and Acquisti , A. Information revelation and privacy in online social networks . In Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2005 ), WPES '05 , ACM, pp. 71 { 80 .

7. Guha , N. Semantic identi cation of web browsing sessions . CoRR abs/1704 .03138 ( 2017 ).

8. Hay , M. , Miklau , G. , Jensen , D. , Towsley , D. , and Weis , P. Resisting structural re-identi cation in anonymized social networks . Proc. VLDB Endow . 1 , 1 (Aug. 2008 ), 102 { 114 .

9. Houvardas , J. , and Stamatatos , E. N-Gram Feature Selection for Authorship Identi cation . Springer Berlin Heidelberg, Berlin, Heidelberg, 2006 , pp. 77 { 86 .

10. Kunegis , J. , Lommatzsch , A. , and Bauckhage , C. The slashdot zoo: Mining a social network with negative edges . In Proceedings of the 18th International Conference on World Wide Web (New York, NY, USA, 2009 ), WWW '09 , ACM, pp. 741 { 750 .

11. Layton , R. , Watters , P. , and Dazeley , R. Authorship attribution for twitter in 140 characters or less . In 2010 Second Cybercrime and Trustworthy Computing Workshop (July 2010 ), pp. 1 { 8 .

12. Le , Q. V. , and Mikolov , T. Distributed representations of sentences and documents . CoRR abs/1405 .4053 ( 2014 ).

13. Mikolov , T. , Sutskever , I. , Chen , K. , Corrado , G. , and Dean , J. Distributed representations of words and phrases and their compositionality . CoRR abs/1310 .4546 ( 2013 ).

14. Narayanan , A. , and Shmatikov , V. De-anonymizing social networks . In 2009 30th IEEE Symposium on Security and Privacy (May 2009 ), pp. 173 { 187 .

15. Ortega , F. J. , Troyano , J. A. , Cruz , F. L. , Vallejo , C. G. , and Enrquez , F. Propagation of trust and distrust for the detection of trolls in a social network . Computer Networks 56 , 12 ( 2012 ), 2884 { 2895 .

16. Su , J. , Shukla , A. , Goel , S. , and Narayanan , A. De-anonymizing web browsing data with social networks . In Proceedings of the 26th International Conference on World Wide Web (Republic and Canton of Geneva, Switzerland , 2017 ), WWW '17, International World Wide Web Conferences Steering Committee, pp. 1261 { 1269 .

17. Thomas , K. , Grier , C. , Song , D. , and Paxson , V. Suspended accounts in retrospect: An analysis of twitter spam . In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (New York, NY, USA, 2011 ), IMC '11, ACM, pp. 243 { 258 .

18. Zheng , R. , Li , J. , Chen , H. , and Huang , Z. A framework for authorship identi cation of online messages: Writing-style features and classi cation techniques . Journal of the American Society for Information Science and Technology 57 , 3 ( 2006 ), 378 { 393 .

19. Zhou , B. , and Pei , J. Preserving privacy in social networks against neighborhood attacks . In 2008 IEEE 24th International Conference on Data Engineering (April 2008 ), pp. 506 { 515 .