  Temporal Word Embeddings for Dynamic User
             Profiling in Twitter

            Breandán Kerin1 , Annalina Caputo2 , and Séamus Lawless1
            1
                ADAPT CENTRE, School of Computer Science & Statistics
                          Trinity College Dublin, Ireland
                   kerinb@tcd.ie,seamus.lawless@scss.tcd.ie
                     2
                       ADAPT CENTRE, School of Computing
                          Dublin City University, Ireland
                            annalina.caputo@dcu.ie



        Abstract. The research described in this paper explores the domain of
        user profiling, a nascent and contentious technology that has attracted
        steadily increasing interest from the research community as its potential
        for providing personalised digital services is realised. A review of the
        related literature revealed that little research has examined how the
        temporal aspects of users can be captured using user profiling tech-
        niques. This gap, together with the notable lack of research into word
        embedding techniques for capturing temporal variation in language, pre-
        sented an opportunity to extend the Random Indexing word embedding
        technique so that users’ interests could be modelled through their use
        of language. To this end, this work extended an existing implementation
        of Temporal Random Indexing to model Twitter users across multiple
        granularities of time based on their use of language. The result is a novel
        technique for temporal user profiling, in which a set of vectors describes
        the evolution of a Twitter user’s interests over time through their use of
        language. The vectors produced were evaluated against a temporal im-
        plementation of another state-of-the-art word embedding technique, the
        Word2Vec Dynamic Independent Skip-gram model, and Temporal Ran-
        dom Indexing was found to outperform Word2Vec in the generation of
        temporal user profiles.

        Keywords: User Modelling · Word Embeddings · Random Indexing.


1      Introduction

As of the time of writing, it is estimated that approximately 4.36 billion people
of an estimated 7.6 billion globally are connected to the internet3 . Some of the
most successful of the 1.9 billion live websites in 2019 are social networking sites4 ,
hosting Online Social Networking (OSN) platforms such as Twitter, Facebook
 3
     http://worldpopulationreview.com/
 4
     http://www.internetlivestats.com/internet-users/




Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
and YouTube which allow users to connect digitally and share online content
with each other. In order to capture, maintain and continuously increase the
engagement of such a large user base in an incredibly complex global environ-
ment, the organisations behind these platforms are increasingly employing user
profiling tactics, where the preferences and interests of the user are modelled,
clustered and learned in order to deliver tailored content directly to them in a
scalable manner.
    As described by Kanoje et al., [6] user profiling is “the process of identifying
the data about a user interest domain... [which] can be used... to understand
more about [the] user and this knowledge can be further used for enhancing the
retrieval for providing satisfaction to the user.” It is a contentious technology
from an ethical perspective: Whether it is always used in accordance with le-
gal and ethical guidelines is highly debatable. Several multinational technology
companies have come under fire for leveraging and capitalising upon their users’
data without obtaining their explicit consent and knowledge, with allegations as
serious as implicitly influencing the US population to elect President Donald J.
Trump to the White House in 20165 . Regardless, user profiling remains highly
desirable for improving user experience, simplifying navigation of the internet
through personalisation, and allowing relevant content to be delivered to users
more efficiently.
    The idea of temporally modelling or profiling users can be motivated by
the observation that an individual user and their data are not static: Their
interests and preferences evolve and vary through time, often following patterns
such as trends, periodicities and spikes. This was demonstrated by Bonneville-
Roussy et al. [2] who found that an individual’s musical interests vary through
time and tend to fluctuate and change around “particular life changes”. Natural
Language Processing (NLP) techniques such as word embeddings6 are already
widely applied in the analysis of users based on textual information. Since the
introduction of the Latent Semantic Analysis (LSA) algorithm [3] in 1990, a
wealth of word embedding techniques has been developed including Random
Indexing (2005) [13], Word2Vec (2013) [11], GloVe (2014) [12], and FastText
(2016) [4], all of which remain in widespread use in NLP applications.
It is clear that strong user profiling techniques should be capable of captur-
ing temporal variances in user characteristics. Yet, at the time of writing, the
modelling of users and their evolving interests through time has received rela-
tively little research attention, despite the volume of research that exists in both
user profiling and NLP. Thus,
exploring new viable approaches to capturing temporal variations in user profiles
is the primary motivation for this research.


5
  https://www.nytimes.com/2018/04/10/us/politics/
  mark-zuckerberg-testimony.html
6
  Word embeddings are a means of representing the semantic properties of a vocabu-
  lary of a corpus in a vector space, and are used widely in the area of NLP.
2     Related Work

2.1    User Profiling on OSNs

User profiling using OSN data has been widely explored by the research commu-
nity, focusing on problems including personality inference and expertise infer-
ence. Wald et al. [14] proposed a personality inference model for Facebook users
which built user profiles based on the demographic and text-based information
present on each user’s Facebook profile. Similarly, Matz et al. [10] proposed an
approach that uses Facebook users’ determined OCEAN7 personality traits as
a basis for mass persuasion, enabling more effective marketing to these users.
In the domain of expertise inference,
Xu et al. [15] developed a novel topic modelling framework to determine the
expertise of Twitter users by employing an extension of the Latent Dirichlet
Allocation (LDA) algorithm to produce an augmented topic model. In their
research, they highlighted the temporal aspects of user profiles as an important
direction for future work.
    When it comes to temporal user profiling, the quantity of research conducted
to date is more limited. In their 2012 paper, Zhang et al. [17] proposed a user
profiling system which modelled users of a mobile network both statically and
dynamically using various modelling techniques, and compared their perfor-
mance using clustering algorithms. In 2017, Liang et al. [8]
proposed a dynamic user clustering topic model which generated a vector-based
model of Twitter users and clustered the results based on their cosine similarity.
In both of these works, the temporal modelling approach was found to outper-
form its static counterparts.
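Both of these systems compare vector-based user models via cosine similarity, which is simple to compute. The following Python sketch is purely illustrative (the function name and the toy profile vectors are assumptions, not taken from either paper):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two profile vectors
    (1 = same direction, 0 = unrelated, -1 = opposite)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical profile vectors for one user at two consecutive time
# slices; a high similarity suggests the user's interests are stable.
profile_t1 = np.array([0.9, 0.1, 0.0])
profile_t2 = np.array([0.8, 0.3, 0.1])
print(cosine_similarity(profile_t1, profile_t2))  # ≈ 0.96
```

Clustering users, as in Liang et al. [8], then amounts to grouping together profiles whose pairwise cosine similarity is high.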


2.2    Temporal Word Embeddings

It is clear that understanding the temporal aspects of words and their semantics
is of major interest to fully understand the users of OSNs. Word embedding
techniques, as a group of widely used NLP techniques, are a strong candidate
to solve this problem. Temporality with respect to word embeddings considers
the way in which word semantics vary over time. Interest in this class of word
embeddings has grown, driven by the abundance of time-variant data sets now
available from major websites and application platforms, as well as an increased
appreciation that word semantics do not remain static over time. Several meth-
ods have been proposed which temporally extend static word embedding tech-
niques.

 – Jurgens and Stevens proposed Temporal Random Indexing (TRI) [5], a tem-
   poral extension of the Random Indexing word embedding technique. This
7
     OCEAN is a set of traits used by psychologists to measure and characterise
     personality. The abbreviation stands for Openness, Conscientiousness, Ex-
     troversion, Agreeableness and Neuroticism.
   algorithm generates word embeddings as a function of time, enabling anal-
   ysis and investigation into the evolution of word meanings over time. This
   technique was used in their research for event detection in blog posts. Sub-
   sequently, Basile et al. [1] successfully applied TRI to event detection in news
   articles.
 – Yao et al. [16] proposed a “dynamic statistical model to learn time-aware
   word vector representation[s]”, building upon the static Word2Vec model
   and applying it to a New York Times dataset. Liang et al. [9] also proposed
   a temporal extension of the Word2Vec technique, using it as the basis for
   a temporal user profiling system which modelled Twitter users’ interests
   through time.

As in the related literature regarding user profiling, it was concluded by these
researchers that temporal models tended to outperform their static counterparts.
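In both families of approaches, the core operation is comparing a word's vector across time slices. The sketch below assumes per-slice embeddings are already available (the function name and toy vectors are illustrative, not taken from any of the cited works) and measures semantic drift as the cosine similarity between consecutive slices:

```python
import numpy as np

def semantic_drift(embeddings_by_slice, word):
    """Cosine similarity of one word's vector between consecutive time
    slices; values well below 1 suggest the word's meaning has shifted."""
    slices = sorted(embeddings_by_slice)
    drifts = []
    for a, b in zip(slices, slices[1:]):
        u, v = embeddings_by_slice[a][word], embeddings_by_slice[b][word]
        sim = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
        drifts.append((a, b, sim))
    return drifts

# Toy per-slice vectors for one word; real vectors would come from a
# temporal model such as TRI or a dynamic Word2Vec variant.
emb = {
    2014: {"cloud": np.array([1.0, 0.1])},
    2015: {"cloud": np.array([0.9, 0.4])},
    2016: {"cloud": np.array([0.1, 1.0])},
}
for a, b, sim in semantic_drift(emb, "cloud"):
    print(a, b, round(sim, 3))
```

Note that such a comparison is only meaningful when the per-slice vector spaces are comparable: TRI guarantees this by sharing the random Index Vectors across slices, whereas independently trained Word2Vec slices must first be aligned.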


3     Methods

It is clear from the research literature that there remain many opportunities to
further investigate temporal user profiling techniques, and that temporal word
embedding techniques show significant promise for understanding the temporal
aspects of users’ interests based on their language usage.
These observations motivated the decision to explore the application of Tempo-
ral Random Indexing to the problem of scalable temporal user profiling using
OSN data.


3.1   The Temporal Random Indexing Technique

TRI is a word embedding technique, proposed by Jurgens and Stevens [5] as a
temporal extension of the Random Indexing method proposed by Sahlgren [13]. In
contrast to other popular embedding methods, RI-based techniques employ an
implicit dimensionality reduction process which preserves the semantic informa-
tion encoded within a term-term co-occurrence matrix. Rather than performing
an explicit reduction upon a co-occurrence matrix as with other techniques8 ,
Context Vectors are generated incrementally by accumulating Index Vectors
from defined context windows. Thus, a significant advantage of RI-based tech-
niques is their ability to generate Context Vectors in an online fashion, where the
model can be continuously updated as new information becomes available with-
out requiring the model to be taken offline9 . In contrast, other state-of-the-art
methods such as Word2Vec and GloVe require training in an offline fashion: As
new data is acquired, the model requires offline re-training and re-deployment.
8
  Examples of word embedding techniques which use explicit dimensionality reduction
  include GloVe and LSA. [3] [12]
9
  Since the Random Index Vectors remain constant, as new data is acquired the model
  can simply add the linear combination of new Index Vectors to the generated Context
  Vectors.
   For a given corpus C consisting of n documents where document d ∈ C, a
vocabulary V of m words can be extracted from C. Given this, the two steps
involved in RI are as follows:
 1. Assign a randomly generated Index Vector ri to each word wi in the vocab-
    ulary: wi ∈ V .
 2. Generate a semantic vector representation svi for each word wi , defined as
    the sum of all random Index Vectors rp assigned to the words that co-occur
    with wi in a given context window given by the range −j < p < +j for
    constant j. This is described by the following equation:
                          sv_i = \sum_{d \in C} \sum_{-j < p < +j} r_p
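The two steps above can be sketched directly in Python. This is an illustrative implementation only: the dimensionality, the number of non-zero entries per Index Vector, and all function names are assumptions rather than values from the paper. RI implementations conventionally use sparse ternary Index Vectors, i.e. a few randomly placed +1/-1 entries in an otherwise zero vector.

```python
import numpy as np
from collections import defaultdict

def make_index_vector(dim=512, seeds=4, rng=None):
    """Sparse ternary random Index Vector: a few +1/-1 entries
    scattered in an otherwise zero vector."""
    rng = rng or np.random.default_rng()
    r = np.zeros(dim)
    positions = rng.choice(dim, size=seeds, replace=False)
    r[positions] = rng.choice([1.0, -1.0], size=seeds)
    return r

def random_indexing(corpus, window=2, dim=512, seed=13):
    rng = np.random.default_rng(seed)
    # Step 1: lazily assign a random Index Vector to each vocabulary word.
    index = defaultdict(lambda: make_index_vector(dim, rng=rng))
    # Step 2: accumulate each word's Context Vector by summing the Index
    # Vectors of the words in the surrounding context window.
    context = defaultdict(lambda: np.zeros(dim))
    for doc in corpus:
        for i, word in enumerate(doc):
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            for p in range(lo, hi):
                if p != i:
                    context[word] += index[doc[p]]
    return context, index

corpus = [["twitter", "users", "share", "tweets"],
          ["users", "post", "tweets", "daily"]]
ctx, idx = random_indexing(corpus)
```

Because the Index Vectors stay fixed once assigned, a later document can be folded in by running the same accumulation loop over it alone, which is exactly the online-update property noted above.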