Behavioral Tracing of Twitter Accounts

Neel Guha
Stanford University

Abstract. "Trolls" - individuals who engage in malicious behavior - are a common occurrence within online communities. Yet simply banning accounts associated with trolls is often ineffective, as individuals may register new accounts under pseudonyms and resume their activity. In this paper, we demonstrate how this can be addressed through a behavioral trace. Specifically, we show that by analyzing the posts of an account, we can derive a semantic signature unique to the account's owner. By comparing the signatures of two accounts, we can determine whether they belong to the same user. We validate our techniques on a dataset of Twitter users, and explore different properties of our methods.

1 Introduction

In recent years, online communities have increasingly struggled with the emergence of malicious accounts. These accounts engage in adversarial behavior, often spreading harmful content or attacking other individuals on the platform. This is especially prominent on Twitter, where most interactions are public and accounts are not required to correspond to real-life identities (unlike Facebook). Eliminating these malicious accounts is difficult for several reasons. First, the process of banning accounts is resource intensive and arduous, occurring rarely and often too late. Platforms like Twitter often rely on some human validation before banning accounts, resulting in a backlog of flagged accounts. Additionally, once an account is banned, it is trivial for the individual to create a new account under a pseudonym. They can resume their malicious behavior through this new account, creating a perpetual cycle.
Though the process of banning accounts will likely remain arduous and time consuming due to legal/corporate policies and procedures, it should be possible to prevent banned individuals from creating new accounts, or at least to detect when an account may have been created by a previously banned individual. In theory, this could be achieved through phone verification or IP address blacklisting. However, these can have unintended consequences. Asking users to validate their accounts with phone numbers may expose individuals who live in oppressive countries and have a legitimate need for privacy. Such measures hamper their ability to use mechanisms like Tor to access Twitter [5]. Banning accounts on the basis of IP address is also ineffective, as an individual could merely switch to a different network to create/access their account. If an individual uses a public machine (in a cybercafe or library), banning on IP address may prevent large numbers of other individuals from accessing their accounts on the same machine.

In this paper, we present behavioral tracing: a method by which accounts created by the same individual can be identified and linked on the basis of the content of the accounts. Intuitively, an account's posts represent the topical interests and idiosyncrasies of its owner. Thus, in examining an account's posts, we should be able to derive a signature unique to the account's owner. We refer to this as a trace, and demonstrate that it can be constructed. By comparing the traces of two different accounts, we can predict whether or not they were owned by the same individual. Applying this in the context presented above, we can use a trace to examine newly created accounts and determine if they resemble a banned account. Our work is novel in its focus on semantic signatures. Unlike prior work, we formulate an authorship model based primarily on the content produced by a user (as opposed to the user's relations in the network graph, or lexical clues).
Rather than constructing user-specific classifiers to identify accounts belonging to the same user, we introduce a single method applicable to all users. Specifically, we derive a vector space representation for each account (based on the account's posts) where the distance between two accounts is indicative of the likelihood that they originate from the same user.

2 Related Work

There is a wealth of literature on different techniques for establishing authorship [9]. [18] presents methods for authorship identification of postings in online communities. The authors experimented with a variety of features (lexical, structural, content based, etc.) and models (neural networks, support vector machines, etc.) to establish authorship of different posts. Though this is similar to our work, there are several key differences. The posts analyzed in [18] were on average over 150 words - well over the 140 character limit of Twitter. Additionally, the goal of that work was to establish the authorship of single posts, not a collection of posts (corresponding to a single account). Though there has been work on establishing authorship in a Twitter context, it has primarily focused on using lexical and syntactic features [11][2]. In our paper, we demonstrate how authorship can also be determined by extracting a semantic (topic based) signature for every user. To the best of our knowledge, this is a novel approach.

It is important to distinguish prior literature on spam and troll detection from our work. We focus on "linking" Twitter accounts to establish when two accounts were launched by the same user. Though a primary application of this work may be in detecting trolls, it could also be used to detect when a single individual is seeking to influence a discussion through the creation of multiple accounts. Much of the prior work on spamming and trolling focuses on leveraging network or language characteristics to identify common traits of banned accounts.
There has been significant prior work on the role of spammers within social networks like Twitter. Many, like [17], have focused on characterizing the nature of spamming Twitter accounts. These works have demonstrated the techniques spammers use to promote content, and various approaches that could be used to detect them. It is important to clarify the distinction between spam detection and the focus of our paper. Spammers primarily use platforms like Twitter to propagate commercial content and convince users to take certain actions (clicking a link, downloading some software, purchasing a product). Spam accounts tend to be "fake" accounts that aren't tied to any real individual, and are often controlled by bots. In contrast, we focus on "real" accounts that are controlled by real individuals and represent their interests. These individuals are thus significantly less likely to follow the behavioral patterns of fake spam accounts. Our work is partially inspired by our prior work in [7], which presents several methods for identifying web users across different browser sessions. Though we incorporate some prior techniques, both our approaches and the nature of the problem are very different.

Prior work has also focused on identifying "trolls" or adversaries within social networks [15] [10]. [4] presents techniques for detecting trolls within social media networks. However, they assume that trolling individuals create fake troll accounts in addition to their real account. Further, the fake account is followed by the real account, and regularly interacts with the real account. On a limited sample of accounts, they present techniques for identifying the authorship of individual tweets. Our work doesn't make these assumptions. [3] analyzes "antisocial users" to determine characteristics of banned users. However, their work focuses less on identifying specific users, and more on analyzing the behavior of banned users on numerous internet forums.
Similarly, there has been significant work on de-anonymizing social network users by utilizing information about network relationships [8] [6] [19]. In particular, [14] demonstrates how anonymized users with accounts on both Flickr and Twitter can be identified using graph topology. [16] attempts to identify Twitter accounts on the basis of browsing histories. By analyzing the t.co URLs visited by a user, they can determine the combination of accounts the user must have been following, which in turn can be used to identify the user's account. However, this approach fails to derive a fingerprint based on the user's interests - a critical contribution of our work. Furthermore, they require the browsing history of a user, a data source that is not often available.

3 Behavioral Trace

In this paper, we introduce behavioral tracing, a method to identify when posts from two Twitter accounts were authored by the same individual. There are many cases where this technique may be applicable. For example, we could apply it to determine when a previously banned user has returned to a platform (i.e. Twitter) and continued their activity under a pseudonym. Alternatively, a user may decide to operate two accounts within a particular community (to reinforce their opinions or create a perception of popularity). Behavioral tracing would allow us to identify such cases. Our intuition is that a user's tweets are drawn from a fixed distribution governed by the user's interests. Given enough tweets from a single user, we can derive some approximation of the original distribution (a user's behavioral trace - also referred to as a user's trace). By comparing the extracted approximations from two accounts, we can determine when two accounts are in fact run by the same user. Thus, if we compare the trace from a banned account to the trace of an active account, we can determine when a user reenters the platform under a pseudonym.
In this section we formalize the notion of a behavioral trace, describe how it can be used, and where limitations exist. This approach assumes that users maintain a consistent interest distribution and that a significant fraction of tweets posted by a user (regardless of the account used) are drawn from this distribution. If this condition were violated - for example, if a user had different interest distributions for different accounts - then it would be significantly harder to extract a meaningful trace. Thus, we assume that when a user has been banned from the platform and returns under a pseudonym, their tweets continue to be drawn from their original distribution. In other words, a user's interests are maintained between both accounts, and their behavior does not alter significantly.

There are, however, several important limitations to acknowledge. Over time, a user's interests are likely to change. Hence we can expect that in the longer term, a user's interest distribution will gradually shift, making it harder to identify a user. This is something we hope to explore in future work. Additionally, if an individual creates two accounts but uses them for significantly different aims (professional and personal), the extracted traces won't be similar enough.

We now formalize the notion of a behavioral trace. We imagine a user u having a set of topical interests characterized by a distribution B over all possible interests/topics. Furthermore, we assume that every tweet (t_i) authored by u is sampled at random from B. Thus, we should expect that as a user posts more tweets, their collection of tweets grows more representative of their interests (the distribution of t_i's should resemble B). Underlying our approach is the assumption that with high probability, any two users u_1 and u_2 will have different interest distributions (B_1 and B_2). We reason that individuals tend to be quite diverse in their interests.
Though most users undoubtedly share common interests (sports teams, hobbies, etc.), the ways in which individuals process or share information tend to be highly personalized. When examined at a highly granular level, most individuals are distinguishable from one another. Thus, in our approach we seek to construct a trace for each user - an approximation of that user's interest distribution inferred from their tweets. We treat the trace as a signature, and use it to fingerprint users.

We frame our problem as follows. Given two sets of tweets (T_1 and T_2) from two different accounts, our goal is to extract a trace (referred to as b̂_1 and b̂_2) that approximates the interest distribution of each account. If the traces are sufficiently similar, then we can determine that they must correspond to the same interest distribution, and that the same user is responsible for writing both sets of tweets. However, if they are sufficiently different, then we can determine that they refer to different interest distributions, and that the two sets of tweets were written by different users.

4 Methodology

We formulate the task as follows. Given a set of tweets from n users, we partition each user's tweets into 2 separate accounts (giving a total of 2n accounts). Our goal is to re-identify the accounts by determining which originate from the same user.

4.1 Approach

We attempt to map each account to a vector based on its interests/behavior/topics. Importantly, we seek to do so in a manner such that accounts corresponding to the same user are close to each other in this vector space. Prior work has demonstrated how word embeddings (e.g. Word2Vec) can capture rich semantic meaning in a way that traditional bag-of-words models cannot [13]. By constructing models to predict a word from its context (or vice versa), these models allow us to map words/phrases to vectors.
Most notably, words that are "close" to each other in the vector space are likely to share similar contexts (and thus meaning). In this work, we draw on Doc2Vec [12], an extension of the Word2Vec model that allows us to construct representations of variable length (i.e. documents). Our approach is motivated by the intuition that we can effectively construct a trace for each user by relying on word embeddings. In doing so, we can derive a vector for each account where the distance between accounts reflects the likelihood that they originate from the same author. In this work, we collate all tweets from an account and treat the account like a single "document". We run Doc2Vec on the collection of accounts to derive a vector representation for each account [1]. Rather than compute a similarity score between every pair of accounts, we run k-means clustering to sort the accounts into different clusters (on the basis of their inferred vectors). In doing so, we're able to learn the "neighborhood" of an account - other accounts that look similar and are thus more likely to originate from the same user. Relying on this intuition, we only calculate a pairwise similarity score for accounts within the same cluster. We assume that accounts in different clusters correspond to different users. We find that this is a relatively safe assumption which allows us to significantly reduce the run time.

After deriving a location for each account in the vector space, we seek to identify the accounts in its neighborhood that could originate from the same user. For two accounts represented as the vectors a_i and a_j, we calculate Score(a_i, a_j) in the following manner:

    Score(a_i, a_j) = Cosine(a_i, a_j) / ( Σ_{k≠i} Cosine(a_i, a_k) + Σ_{k≠j} Cosine(a_j, a_k) )    (1)

We describe this as a "weighted similarity" function, which weighs the similarity of two accounts by how dissimilar they are from other accounts. It is not sufficient to say that two accounts are similar.
Rather, we can only be confident that two accounts correspond to the same user if they are both similar to each other and dissimilar to other accounts. If we have two accounts a_i and a_j such that a_i is similar to a_j, but both a_i and a_j are similar to the bulk of the accounts in our dataset, we are less confident that a_i and a_j originate from the same user. It is probable that a_i and a_j (and the accounts they are similar to) belong to a mass of users whose behavior is too shallow or generic to discern. Conversely, if a_i and a_j were similar to each other but different from other accounts, we would be significantly more confident that both accounts originated from the same user. Hence, our scoring function is weighted by both account similarity and account dissimilarity.

For calculating the similarity between two accounts a_i and a_j, we use the cosine similarity metric, a common measure in information retrieval. For two n-dimensional vectors, the cosine similarity is calculated by

    Cosine(s_i, s_j) = (s_i · s_j) / (||s_i|| ||s_j||)

For each account, we deem the account with the highest score that exceeds the threshold to be from the same author. If no accounts have a score above the threshold, then the account in question is deemed not to share an author with any other account in the dataset. If multiple other accounts have scores which exceed the threshold, we only pick the account with the highest score. As we discuss in the next section, this approach is highly flexible, allowing us to achieve different types of results by varying the cutoff score used.

4.2 Evaluation

We measure the success of our approach using the precision-recall framework. Precision is defined as the proportion of account pairings we identify that are correct:

    Precision = |St ∩ Sp| / |Sp|

where Sp is the set of account pairings we predict and St is the set containing all pairs of accounts that originate from the same user (truth).
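The weighted scoring and threshold-matching rule can be sketched as follows. This reflects our reading of Eq. (1), with the denominator summing each account's cosine similarity to every other account in its neighborhood; the helper names are ours, not the paper's.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def score(i, j, vecs):
    """Weighted similarity: cosine(i, j) normalized by how similar
    i and j each are to all other accounts in the neighborhood."""
    denom = (sum(cosine(vecs[i], vecs[k]) for k in vecs if k != i)
             + sum(cosine(vecs[j], vecs[k]) for k in vecs if k != j))
    return cosine(vecs[i], vecs[j]) / denom if denom else 0.0

def best_match(i, vecs, threshold):
    """Highest-scoring candidate above the threshold, else None."""
    s, j = max((score(i, j, vecs), j) for j in vecs if j != i)
    return j if s >= threshold else None
```

Note that two accounts similar to everything score low even when similar to each other, capturing the "generic user" intuition above.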
Recall is defined as the proportion of same-user account pairs that are identified by our methodology, or

    Recall = |St ∩ Sp| / |St|

In the context of our application, precision is the proportion of identified account pairings that do correspond to the same user. Recall is the proportion of same-user account pairings that we do identify. Using the precision-recall framework to evaluate our approach allows us to modulate the type of result achieved. Depending on the context in which we're applying the methodology, this can differ. Sometimes, we may require a strategy that delivers high precision. This would be preferable, for example, if we chose to be conservative in our identification of accounts. Alternatively, we may want to flag as many accounts as possible. In this case, we would prefer a strategy which delivered high recall (even at the cost of precision).

4.3 Baseline

To establish a baseline, we simulate an adversary randomly guessing accounts as pairs. We do this by randomly generating a score between [0, 1] for each pair of accounts. We pick the cutoff that maximizes the F1 score and report results at that threshold. In addition, we offer a more advanced baseline by running k-means clustering directly on the generated Doc2Vec vectors for each account. Specifically, we set the number of desired clusters equal to the number of users. If two accounts are contained in the same cluster, we predict those two accounts to originate from the same user.

5 Data

Using the Twitter API, we collected 1,270,999 tweets from 1849 users. Of these, 678,403 were retweets and 592,596 were original tweets by users. Figure 1 shows a cumulative histogram of the number of tweets for every account in the dataset. The vast majority have fewer than 2000 tweets. Figure 2 is a histogram of the proportion of retweets for all accounts (the fraction of an account's tweets that are retweets).
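The precision/recall computation of Section 4.2 and the randomized baseline of Section 4.3 can be sketched as follows. This is an illustrative sketch under our own conventions: pairings are represented as frozensets of two account ids, and the function names are assumptions.

```python
import random

def precision_recall_f1(predicted, truth):
    """predicted, truth: sets of frozenset({account_a, account_b}) pairings.
    Returns (precision, recall, F1) per the definitions in Section 4.2."""
    correct = predicted & truth
    p = len(correct) / len(predicted) if predicted else 0.0
    r = len(correct) / len(truth) if truth else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

def random_baseline(accounts, cutoff, seed=0):
    """Randomized baseline: assign a uniform [0, 1) score to every
    account pair and predict the pairs whose score exceeds the cutoff."""
    rng = random.Random(seed)
    accounts = list(accounts)
    return {frozenset((a, b))
            for idx, a in enumerate(accounts)
            for b in accounts[idx + 1:]
            if rng.random() > cutoff}
```

Sweeping the cutoff and keeping the value that maximizes F1 reproduces the baseline's threshold-selection step.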
The majority of accounts in our dataset are regularly active, with half posting at least 1.84 times per day. Given that the focus of this work was on using semantic clues to develop unique identifiers for different Twitter accounts, we took care to clean tweets so that the algorithm would not identify accounts on the basis of their network properties. Specifically, we removed all account handles from the text of every tweet (e.g., "@exampleAccount").

Method                             F1 Score   Precision   Recall
Behavioral Tracing (No Retweets)   0.54       0.54        0.54
Behavioral Tracing (All Tweets)    0.69       0.70        0.69
Raw K-Means                        0.45       0.45        0.45
Randomized Baseline                0.00045    0.00086     0.000345

Table 1. Results of different approaches

Fig. 1. Cumulative histogram of the number of tweets for every user in our data set

We evaluated our algorithm as follows. We split our dataset of users into two groups - a "training" set and a "testing" set. Within each set, we split each user into two separate accounts (with each account containing half of the user's tweets). We applied our algorithm on the training set, and identified the cutoff distance at which the F1 score was maximized. We then applied our algorithm to the test set, using the cutoff score derived from the training set to predict which accounts belonged to the same user. We repeated this process 25 times (sampling a different training and testing set each time). This procedure allows us to justify the final cutoff used to identify accounts. We can imagine that in different contexts, different cutoffs might be necessary. "Learning" the cutoff in this manner allows us to better approximate an optimal one.

Additionally, we experimented with the effect of retweets on our approach's performance. We thus ran two variations of our strategy. In the first, we ignored all retweets by accounts, using only "original" tweets to construct account traces.
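The data preparation described above - stripping account handles from tweet text, then splitting each user's tweets into two pseudo-accounts - can be sketched as follows. The regex and function names are our assumptions; the paper does not specify its exact cleaning implementation.

```python
import random
import re

HANDLE = re.compile(r"@\w+")

def strip_handles(tweet):
    """Remove handles (e.g. "@exampleAccount") so traces reflect
    content rather than network ties; collapse leftover whitespace."""
    return " ".join(HANDLE.sub("", tweet).split())

def split_user(tweets, seed=0):
    """Partition one user's cleaned tweets into two pseudo-accounts,
    each holding half of the user's tweets."""
    cleaned = [strip_handles(t) for t in tweets]
    rng = random.Random(seed)
    rng.shuffle(cleaned)
    half = len(cleaned) // 2
    return cleaned[:half], cleaned[half:]
```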
In the second variation, we used all of an account's tweets (including retweets) to construct the trace. Table 1 presents the results of these two versions, along with the baseline performance.

We experimented by varying the number of tweets each account was generated from. We see that as the number of tweets per account increases, the algorithm's performance improves (Figure 4). However, we observe that the overall performance of the algorithm appears to level off after roughly 200 tweets. We also experimented by varying the number of users targeted by our approach. We find that generally, as the number of users analyzed increases, the algorithm's ability to extract a uniquely identifiable fingerprint decreases.

Fig. 2. Histogram of the proportion of tweets that were retweets for every user in our dataset

6 Discussion and Conclusion

The results in Table 1 demonstrate the effectiveness of our approach. Our algorithm exceeds both the randomized and naive clustering baselines, suggesting that our methods are capable of both successfully constructing unique traces, and using these traces to identify when tweets from two accounts are authored by the same user.

Our techniques demonstrate significantly improved results when we use an account's retweets to derive a semantic signature. There are several ways to interpret this result. It's possible that by using an account's retweets, our extracted semantic signature is influenced by the user's location in the Twitter network. Users are more likely to retweet accounts that they are following/followed by. Thus, when the majority of a user's tweets are retweets, the extracted semantic signature is effectively a reflection of the network structure surrounding the user. However, a user's retweets are likely to reflect their interest profile. Furthermore, every account they retweet is also likely to have an extractable semantic signature (given enough tweets).
Thus, we can view the extracted semantic signature for a user not solely as their own, but as a composition of the semantic signatures of the accounts they frequently retweet.

We also find that performance in general improves as the number of tweets sampled for each account increases. Intuitively, this follows. As we gather more tweets from an account, we're able to better approximate the user's profile, and thus build a better trace. After a while, however, there appear to be diminishing returns.

Fig. 3. Proportion of accounts grouped into the same cluster for different numbers of clusters

Additionally, we find that as the number of accounts we run our algorithm on increases, performance tends to decrease. As we grow our sample, we can imagine that accounts grow less distinguishable, and tend towards a more general, "average" interest profile. In these cases, it becomes hard for us to extract a unique trace for each account. However, the results in Figure 5 suggest that our strategy still finds success for larger samples of users. It's likely that our approach is conducting a variant of "outlier detection", in effect identifying users who are sufficiently different from all others. Additionally, we find that when the sample of users is small, the cutoff learned on the training set results in poorer performance on the testing set (farther away from the optimal point). When the cutoff is large, however, we find that the performance on the training set is comparable to the test set.

The methods we present can be extended beyond the problem posed in this paper. The ability to construct fingerprints for users on the basis of their behavior has wide-ranging implications for privacy and security. Broadly speaking, behavioral tracing is applicable in any domain where individuals take actions consistent with a set of interests, habits, or tasks. It could be used, for example, to identify someone on the basis of online purchases.
Equivalently, it could also be used to disambiguate between multiple individuals using a single account (e.g. on Netflix). At its core, behavioral tracing offers a way of uniquely identifying individuals on the basis of their behavior. Because behavior is hard to mask or alter, behavioral tracing is especially potent.

Fig. 4. Algorithm performance as the tweet sample size varies

Fig. 5. Algorithm performance (for train and test sets) as the number of users in the sample varies

In summary, the primary contribution of our work is behavioral tracing, a topical authorship model for Twitter. In framing a user's tweets as samples from their interest distribution, we demonstrate how users can be fingerprinted on the basis of a semantic signature. Validating our approach on real-world Twitter data, we demonstrate how it can find success at identifying users across different accounts.

7 Acknowledgments

We'd like to thank Anand Shukla, Ramakrishnan Srikant, Dan Boneh, Ramanathan Guha, Mehran Sahami, Lea Kissner, Scott Ellis, and Jonathan Mayer for their advice and guidance on this project.

References

1. Deep learning with paragraph2vec. https://radimrehurek.com/gensim/models/doc2vec.html.
2. Bhargava, M., Mehndiratta, P., and Asawa, K. Stylometric Analysis for Authorship Attribution on Twitter. Springer International Publishing, Cham, 2013, pp. 37-47.
3. Cheng, J., Danescu-Niculescu-Mizil, C., and Leskovec, J. Antisocial behavior in online discussion communities. CoRR abs/1504.00680 (2015).
4. Galán-García, P., de la Puerta, J. G., Gómez, C. L., Santos, I., and Bringas, P. G. Supervised Machine Learning for the Detection of Troll Profiles in Twitter Social Network: Application to a Real Case of Cyberbullying. Springer International Publishing, Cham, 2014, pp. 419-428.
5. Gibbs, S. The Problem With Twitter's New Abuse Strategy. https://www.theguardian.com/technology/2015/mar/04/twitters-new-bid-to-end-online-abuse-could-endanger-dissidents-analysis.
6.
Gross, R., and Acquisti, A. Information revelation and privacy in online social networks. In Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2005), WPES '05, ACM, pp. 71-80.
7. Guha, N. Semantic identification of web browsing sessions. CoRR abs/1704.03138 (2017).
8. Hay, M., Miklau, G., Jensen, D., Towsley, D., and Weis, P. Resisting structural re-identification in anonymized social networks. Proc. VLDB Endow. 1, 1 (Aug. 2008), 102-114.
9. Houvardas, J., and Stamatatos, E. N-Gram Feature Selection for Authorship Identification. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006, pp. 77-86.
10. Kunegis, J., Lommatzsch, A., and Bauckhage, C. The slashdot zoo: Mining a social network with negative edges. In Proceedings of the 18th International Conference on World Wide Web (New York, NY, USA, 2009), WWW '09, ACM, pp. 741-750.
11. Layton, R., Watters, P., and Dazeley, R. Authorship attribution for twitter in 140 characters or less. In 2010 Second Cybercrime and Trustworthy Computing Workshop (July 2010), pp. 1-8.
12. Le, Q. V., and Mikolov, T. Distributed representations of sentences and documents. CoRR abs/1405.4053 (2014).
13. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. Distributed representations of words and phrases and their compositionality. CoRR abs/1310.4546 (2013).
14. Narayanan, A., and Shmatikov, V. De-anonymizing social networks. In 2009 30th IEEE Symposium on Security and Privacy (May 2009), pp. 173-187.
15. Ortega, F. J., Troyano, J. A., Cruz, F. L., Vallejo, C. G., and Enríquez, F. Propagation of trust and distrust for the detection of trolls in a social network. Computer Networks 56, 12 (2012), 2884-2895.
16. Su, J., Shukla, A., Goel, S., and Narayanan, A. De-anonymizing web browsing data with social networks. In Proceedings of the 26th International Conference on World Wide Web (Republic and Canton of Geneva, Switzerland, 2017), WWW '17, International World Wide Web Conferences Steering Committee, pp. 1261-1269.
17. Thomas, K., Grier, C., Song, D., and Paxson, V. Suspended accounts in retrospect: An analysis of twitter spam. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (New York, NY, USA, 2011), IMC '11, ACM, pp. 243-258.
18. Zheng, R., Li, J., Chen, H., and Huang, Z. A framework for authorship identification of online messages: Writing-style features and classification techniques. Journal of the American Society for Information Science and Technology 57, 3 (2006), 378-393.
19. Zhou, B., and Pei, J. Preserving privacy in social networks against neighborhood attacks. In 2008 IEEE 24th International Conference on Data Engineering (April 2008), pp. 506-515.