<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Temporal Word Embeddings for Dynamic User Profiling in Twitter</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Breandan Kerin</string-name>
          <email>kerinb@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Annalina Caputo</string-name>
          <email>annalina.caputo@dcu.ie</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Seamus Lawless</string-name>
          <email>seamus.lawless@scss.tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>ADAPT CENTRE, School of Computer Science &amp; Statistics, Trinity College Dublin</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
<institution>ADAPT CENTRE, School of Computing, Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The research described in this paper focused on exploring the domain of user profiling, a nascent and contentious technology which has been steadily attracting increased interest from the research community as its potential for providing personalised digital services is realised. An extensive review of related literature revealed that limited research has been conducted into how temporal aspects of users can be captured using user profiling techniques. This, coupled with the notable lack of research into the use of word embedding techniques to capture temporal variances in language, revealed an opportunity to extend the Random Indexing word embedding technique such that the interests of users could be modelled based on their use of language. To achieve this, this work concerned itself with extending an existing implementation of Temporal Random Indexing to model Twitter users across multiple granularities of time based on their use of language. The product of this is a novel technique for temporal user profiling, where a set of vectors is used to describe the evolution of a Twitter user's interests over time through their use of language. The vectors produced were evaluated against a temporal implementation of another state-of-the-art word embedding technique, the Word2Vec Dynamic Independent Skip-gram model, where it was found that Temporal Random Indexing outperformed Word2Vec in the generation of temporal user profiles.</p>
      </abstract>
      <kwd-group>
        <kwd>User Modelling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>As of the time of writing, it is estimated that approximately 4.36 billion people of an estimated 7.6 billion globally are connected to the internet. Some of the most successful of the 1.9 billion live websites in 2019 are social networking sites, hosting Online Social Networking (OSN) platforms such as Twitter, Facebook and YouTube which allow users to connect digitally and share online content with each other. In order to capture, maintain and continuously increase the engagement of such a large user base in an incredibly complex global environment, the organisations behind these platforms are increasingly employing user profiling tactics, where the preferences and interests of the user are modelled, clustered and learned in order to deliver tailored content directly to them in a scalable manner.</p>
      <p>
As described by Kanoje et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
] user profiling is "the process of identifying the data about a user interest domain... [which] can be used... to understand more about [the] user and this knowledge can be further used for enhancing the retrieval for providing satisfaction to the user." It is a contentious technology from an ethical perspective: whether it is always used in accordance with legal and ethical guidelines is highly debatable. Several multinational technology companies have come under fire for leveraging and capitalising upon their users' data without obtaining their explicit consent and knowledge, with allegations as serious as implicitly influencing the US population to elect President Donald J. Trump to the White House in 2016 (https://www.nytimes.com/2018/04/10/us/politics/mark-zuckerberg-testimony.html). Regardless of this, user profiling has high potential and desirability from the perspective of improving user experience, simplifying navigation of the internet through personalisation and allowing relevant content to be delivered to users more efficiently.
      </p>
      <p>
        The idea of temporally modelling or pro ling users can be motivated by
the observation that an individual user and their data are not static: Their
interests and preferences evolve and vary through time, often following patterns
such as trends, periodicities and spikes. This was demonstrated by Bonneville-Roussy et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
] who found that an individual's musical interests vary through time and tend to fluctuate and change around "particular life changes". Natural Language Processing (NLP) techniques such as word embeddings, a means of representing the semantic properties of the vocabulary of a corpus in a vector space, are already widely applied in the analysis of users based on textual information. Since the introduction of the Latent Semantic Analysis (LSA) algorithm [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in 1990, a
wealth of word embedding techniques has been developed including Random
Indexing (2005) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Word2Vec (2013) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], GloVe (2014) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and FastText
(2016) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], all of which remain in widespread use in NLP applications.
      </p>
<p>It is clear that strong user profiling techniques should be capable of capturing temporal variances in user characteristics. Despite the volume of research that exists in both user profiling and NLP, the idea of capturing temporal variances whilst modelling users and their interests through time has, at the time of writing, not been the subject of a great deal of research. Thus, exploring new viable approaches to capturing temporal variations in user profiles is the primary motivation for this research.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <sec id="sec-2-1">
<title>User Profiling on OSNs</title>
        <p>
User profiling using OSN data has been widely explored by the research community, focusing on problems including personality inference and expertise inference. Wald et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
] proposed a personality inference model for Facebook users which built user profiles based on the demographic and text-based information present on each user's Facebook profile. Similarly, Matz et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
] proposed an approach to analysing the personality traits of Facebook users, using their determined OCEAN (Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism) personality traits as a basis to enable mass persuasion for marketing more effectively to these users. In the domain of expertise inference, Xu et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
] developed a novel topic modelling framework to determine the expertise of Twitter users by employing an extension of the Latent Dirichlet Allocation (LDA) algorithm to produce an augmented topic model. In their research, they noted the importance of addressing the temporal aspects of user profiles in future work.
        </p>
        <p>
When it comes to temporal user profiling research, the quantity of research conducted to date is more limited. In their 2012 paper, Zhang et al. [17] proposed a user profiling system which modelled users of a mobile network both statically and dynamically using various different modelling techniques and compared the performance using clustering algorithms. In 2017, Liang et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
proposed a dynamic user clustering topic model which generated a vector-based model of Twitter users and clustered the results based on their cosine similarity. In both of these works, the temporal modelling approach was found to outperform its static counterparts.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Temporal Word Embeddings</title>
        <p>
It is clear that understanding the temporal aspects of words and their semantics is of major interest to fully understand the users of OSNs. Word embedding techniques, as a group of widely used NLP techniques, are a strong candidate to solve this problem. Temporality with respect to word embeddings considers the way in which word semantics vary over time. Interest in this variety of word embedding has grown, stemming from the fact that there is now an abundance of time-variant data sets available from major websites and application platforms, as well as an increased appreciation of the fact that word semantics do not remain static over time. Several methods have been proposed in research which temporally extend static word embedding methods.
- Jurgens and Stevens proposed Temporal Random Indexing (TRI) [
          <xref ref-type="bibr" rid="ref5">5</xref>
], a temporal extension of the Random Indexing word embedding technique. This
          algorithm generates word embeddings as a function of time, enabling
analysis and investigation into the evolution of word meanings over time. This
technique was used in their research for event detection in blog posts.
Subsequently, Basile et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] successfully applied TRI to event detection in news
articles.
- Yao et al. [
          <xref ref-type="bibr" rid="ref16">16</xref>
] proposed a "dynamic statistical model to learn time-aware word vector representation[s]", building upon the static Word2Vec model, which they applied to a New York Times dataset. Liang et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
] also proposed a temporal extension of the Word2Vec technique, using it as the basis for a temporal user profiling system which modelled Twitter users' interests through time.
        </p>
<p>As in the related literature regarding user profiling, these researchers likewise concluded that temporal models tended to outperform their static counterparts.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methods</title>
<p>It is clear from the research literature that there remain many opportunities to further investigate potential temporal user profiling techniques, and that temporal word embedding techniques show significant promise for understanding the temporal aspects of users' interests based on their language usage. These observations motivated the decision to explore the application of Temporal Random Indexing to the problem of scalable temporal user profiling using OSN data.</p>
      <sec id="sec-3-1">
        <title>The Temporal Random Indexing Technique</title>
        <p>
TRI is a word embedding technique, proposed by Jurgens and Stevens [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] as a
temporal extension of the Random Indexing (RI) method proposed by Sahlgren [
          <xref ref-type="bibr" rid="ref13">13</xref>
]. In contrast to other popular embedding methods, RI-based techniques employ an implicit dimensionality reduction process which preserves the semantic information encoded within a term-term co-occurrence matrix. Rather than performing an explicit reduction upon a co-occurrence matrix, as techniques such as LSA [
<xref ref-type="bibr" rid="ref3">3</xref>
] and GloVe [
<xref ref-type="bibr" rid="ref12">12</xref>
] do, Context Vectors are generated incrementally by accumulating Index Vectors from defined context windows. Thus, a significant advantage of RI-based techniques is their ability to generate Context Vectors in an online fashion: since the random Index Vectors remain constant, as new data is acquired the model can simply add the linear combination of new Index Vectors to the existing Context Vectors, without requiring the model to be taken offline. In contrast, other state-of-the-art methods such as Word2Vec and GloVe require training in an offline fashion: as new data is acquired, the model requires offline re-training and re-deployment.
        </p>
<p>For a given corpus C consisting of n documents, where document d ∈ C, a vocabulary V of m words can be extracted from C. Given this, the two steps involved in RI are as follows:
1. Assign a randomly generated Index Vector r_p to each word w_i in the vocabulary, w_i ∈ V.
2. Generate a semantic vector representation sv_i for each word w_i, defined as the sum of all random Index Vectors r_p assigned to the words that co-occur with w_i in a given context window given by the range -j &lt; p &lt; +j for constant j. This is described by the following equation:</p>
<p>sv_i = Σ_{d ∈ C} Σ_{-j &lt; p &lt; +j} r_p</p>
<p>In the case of Temporal Random Indexing, before step 1 can be performed the corpus C must first be annotated with metadata recording the creation time and date of the data. Using this time data, the corpus C can then be split into k subsets C_1, C_2, ..., C_k, where k is the number of time periods to analyse. To ensure that the Context Vectors produced by TRI are comparable across multiple time periods, the Index Vectors remain constant during the entire embedding generation process across each time period T_k, and the previous time period's Context Vectors are re-used as the initial state of the new time period's Context Vectors.</p>
<p>The major divergence between TRI and RI occurs in step 2 of the process. In RI, all data within the corpus is used to generate the vectors, as time is considered static and does not vary. In contrast, TRI considers time as dynamic, and thus a separate vector space is generated for each time period T_k. The equation governing this second step in TRI is given as follows:</p>
<p>sv_{i,T_k} = Σ_{d ∈ C_k} Σ_{-j &lt; p &lt; +j} r_p</p>
<p>Using this approach, it is possible to build a vector space for each time period T_k over the corresponding subset C_k annotated with creation-time metadata. Each word w_i contained within the corpus has a unique Context Vector representation for each time period T_k considered, all built upon the same random Index Vectors. This allows for direct comparison between words within different time periods, since they are generated from a linear combination of the same random Index Vectors.</p>
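<p>The time-sliced procedure can be sketched as follows. This is again an illustrative sketch: the Index Vectors are held fixed by deriving them deterministically from each word (via a stable checksum, an assumption of this sketch), and each period's Context Vectors start from the previous period's state.</p>

```python
import zlib
import numpy as np

def index_vector(word, dim=100, nonzero=8):
    # Deterministic per-word sparse random vector: held constant
    # across all time periods so the spaces stay comparable.
    wr = np.random.default_rng(zlib.crc32(word.encode()))
    v = np.zeros(dim)
    pos = wr.choice(dim, size=nonzero, replace=False)
    v[pos] = wr.choice([-1.0, 1.0], size=nonzero)
    return v

def tri(period_corpora, j=2, dim=100):
    # period_corpora: list of corpora C_1..C_k, one per time period T_k.
    # Context Vectors carry over: period k starts from period k-1's state.
    sv = {}
    spaces = []
    for corpus in period_corpora:
        for doc in corpus:
            for i, word in enumerate(doc):
                sv.setdefault(word, np.zeros(dim))
                for p in range(max(0, i - j), min(len(doc), i + j + 1)):
                    if p != i:
                        sv[word] = sv[word] + index_vector(doc[p], dim)
        # Snapshot the vector space for this period T_k.
        spaces.append({w: v.copy() for w, v in sv.items()})
    return spaces

spaces = tri([[["a", "b"]], [["a", "c"]]])
```

<p>Because every snapshot is built from the same Index Vectors, a word's vectors from different periods can be compared directly.</p>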
      </sec>
      <sec id="sec-3-2">
<title>Augmented Temporal Random Indexing for User Profiling</title>
<p>In this research, the previously described TRI technique is extended to capture a vectorised representation of Twitter users based on their language usage, using the same Index Vectors to generate a Context Vector representing the user. Doing this allows not only for a direct comparison of words in the same vector space across multiple time periods, but also for the comparison of a single user with the shared vocabulary of all users in the dataset using vector mathematics.</p>
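<p>Such comparisons in a shared vector space typically reduce to cosine similarity. A minimal sketch (the toy 2-d vectors and word list are assumptions for illustration only):</p>

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_words(user_vec, word_vecs, k=3):
    # Rank vocabulary words by similarity to the user vector; this is
    # meaningful because user and word vectors share one vector space.
    ranked = sorted(word_vecs,
                    key=lambda w: cosine(user_vec, word_vecs[w]),
                    reverse=True)
    return ranked[:k]

word_vecs = {"music": np.array([1.0, 0.2]),
             "politics": np.array([-0.5, 1.0]),
             "sport": np.array([0.3, -1.0])}
user = np.array([0.9, 0.1])
```
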
<p>This information about the user can only be captured by utilising the metadata attached to the Tweets in the corpus: specifically, the creation time and the user id present in Tweet metadata. The additional step required to facilitate this involves augmenting the TRI technique to generate these user vectors, achieved by collating the text written by a user in a given time period T_k and leveraging the same random Index Vectors used in generating the word embeddings.</p>
<p>Let T_k be a time period spanning from t_k_start to t_k_end, where t_k_start predates t_k_end. In order to build the user vector uv_i for time period T_k, we consider all word occurrences within the Tweets authored by user i within the time period T_k, and sum the Index Vectors r_p of those words as follows:</p>
<p>uv_{i,T_k} = Σ_{p ∈ T_k} r_p</p>
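<p>This augmentation can be sketched as follows. The tweet structure (a dict with 'user', 'time' and 'text' fields), the numeric timestamps and the whitespace tokenisation are assumptions of the sketch, not the implementation used in this work.</p>

```python
import zlib
import numpy as np

def index_vector(word, dim=100, nonzero=8):
    # The same fixed random Index Vectors used for the word embeddings.
    wr = np.random.default_rng(zlib.crc32(word.encode()))
    v = np.zeros(dim)
    pos = wr.choice(dim, size=nonzero, replace=False)
    v[pos] = wr.choice([-1.0, 1.0], size=nonzero)
    return v

def user_vector(tweets, user_id, t_start, t_end, dim=100):
    # uv_{i,T_k}: sum of the Index Vectors of every word occurrence
    # authored by user i within the period [t_start, t_end).
    uv = np.zeros(dim)
    for tw in tweets:
        if tw["user"] == user_id and t_start <= tw["time"] < t_end:
            for word in tw["text"].split():
                uv += index_vector(word, dim)
    return uv

tweets = [
    {"user": 1, "time": 3, "text": "music festival"},
    {"user": 1, "time": 9, "text": "election"},   # outside the period
    {"user": 2, "time": 4, "text": "music"},      # different user
]
uv1 = user_vector(tweets, user_id=1, t_start=0, t_end=6)
```
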
<p>Applying this additional step to the TRI technique, it is possible to construct a vector space for each time period T_k from a corpus C of n documents containing metadata on Tweet creation time and user id. Each user u_i has a distinct vector representation uv_{i,T_k} in this vector space for a given time period, generated by accumulating the random Index Vectors for each occurrence of a word used by user i in the time period T_k.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
<p>The evaluation of this research focused on measuring how performant TRI is at modelling a user and their interests temporally based on their use of language. A temporal extension of the Word2Vec Skip-Gram model, the Dynamic Independent Skip-Gram (DISG) model, which is an inherently temporal model, is used as a baseline comparison to the TRI user profiling method described in this paper. The DISG model utilises the same time-period-separated data and independently generates embeddings for the words used in each time period. To generate a single user vector using DISG, the word vectors for each word used by a given user in a given time period are summed and averaged, resulting in a user vector which is the centroid of the word vectors used by that user in that time period.</p>
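<p>The baseline's centroid step can be sketched as follows, with toy 2-d vectors standing in for trained per-period DISG embeddings (an assumption of the sketch):</p>

```python
import numpy as np

def centroid_user_vector(word_vectors, user_words):
    # DISG baseline step: the user vector is the mean (centroid)
    # of the per-period word vectors for the words the user used.
    vecs = [word_vectors[w] for w in user_words if w in word_vectors]
    return np.mean(vecs, axis=0)

# Toy per-period embeddings standing in for trained DISG vectors.
word_vectors = {"music": np.array([1.0, 0.0]),
                "festival": np.array([0.0, 1.0])}
uv = centroid_user_vector(word_vectors, ["music", "festival"])
```
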
      <p>
        For the purposes of this research, the proposed system was evaluated by
following the same approach and collection of Tweets used by Liang et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
] The collection consists of 1,375 randomly selected Twitter users, along with all the tweets that they authored from the beginning of their registration up to and including May 31, 2015. The input dataset is built by splitting all the Tweets into several different time-period granularities: month, quarter, and semester.
      </p>
      <p>
An automatically generated ground truth is created by extracting all the hashtag words associated with a specific period of time. Specifically, given the collection of Tweets belonging to each of the time periods, we extracted all of the hashtags from the Tweets in each time period, removing the hash character ('#') and lowercasing the hashtag word. Finally, we ranked all of the hashtags by their tf-idf, where the idf was computed over all of the Tweets authored by a given user, and selected only the top 10 hashtags as ground truth. This is an important difference from the method described in [
        <xref ref-type="bibr" rid="ref7">7</xref>
], since in the original approach the hashtags were ranked by their count in the tweets. We observed that many time periods contained only one occurrence of each hashtag, leading to multiple ties. Hence, it was decided that ranking the hashtags by how specific they are to a given period was the most effective approach. The evaluation uses hashtags as a proxy for the most important concepts shared by a user in a specific time period, and it aims to assess the capability of the system at retrieving those relevant hashtags. Since the problem is formulated as a typical retrieval task, we adopted Mean Average Precision (MAP) and Precision@k as evaluation metrics.
      </p>
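<p>The ground-truth construction described above can be sketched as follows. This is a simplified illustration: the tweet representation (plain strings), the hashtag regex, and the exact tf-idf weighting are assumptions of the sketch rather than the authors' implementation.</p>

```python
import math
import re
from collections import Counter

def top_hashtags(period_tweets, all_user_tweets, k=10):
    # Extract hashtags in the period, strip '#', lowercase, then rank
    # by tf-idf: tf counted over the period, idf over all the user's tweets.
    def tags(text):
        return [t.lower() for t in re.findall(r"#(\w+)", text)]

    tf = Counter(t for tw in period_tweets for t in tags(tw))
    n = len(all_user_tweets)
    scores = {}
    for tag, count in tf.items():
        df = sum(1 for tw in all_user_tweets if tag in tags(tw))
        scores[tag] = count * math.log(n / (1 + df))
    return [t for t, _ in sorted(scores.items(), key=lambda x: -x[1])][:k]

period = ["I love #Music #music", "found a #rare gem"]
history = period + ["#music again", "no tags here", "plain text"]
top = top_hashtags(period, history)
```

<p>Period-specific hashtags (here "rare") are boosted relative to hashtags the user repeats across their whole history, which is the motivation given above for preferring tf-idf over raw counts.</p>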
    </sec>
    <sec id="sec-5">
      <title>Results</title>
<p>By applying the techniques described in the Methods and Evaluation sections to the prepared Twitter dataset, and evaluating them against the generated ground truths using the trec_eval evaluation tool, the results shown in Table 1 were observed.</p>
<p>[Table 1: MAP scores across the time granularities considered. TRI: 0.0126, 0.0142, 0.0183; the corresponding DISG values are not recoverable.]</p>
<p>As can be observed from Table 1, although poor MAP and precision values were obtained for both models, TRI was found to outperform the Word2Vec DISG model for each of the time periods considered. This is a promising result for the usability of TRI, and further work on improving the data processing, implementation and evaluation methods is likely to yield further improvements in results.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
<p>This research has demonstrated the potential of the TRI technique as an effective approach to temporal user profiling through its application in the analysis of temporal variations in language use.</p>
<p>The results obtained demonstrate that, on this task, the TRI technique outperforms Word2Vec, a word embedding technique currently considered to be state-of-the-art in the NLP domain. This is a significant finding, and highlights the potential of applying word embedding techniques in temporal user profiling applications. As a domain which has seen little research to date, and given the ever-increasing volume of text content generated by users of some of the world's biggest OSN platforms, the growing importance of strong NLP techniques such as TRI cannot be overstated.</p>
<p>It is the opinion of the authors that, with further refinement and experimentation, a temporal user profiling system employing TRI could be realised which is performant and scalable enough to be considered for use in online, web-scale environments.</p>
<p>Acknowledgement. This research was conducted with the financial support of Science Foundation Ireland under Grant Agreement No. 13/RC/2106 at the ADAPT SFI Research Centre at Trinity College Dublin and Dublin City University, and by the European Union's Horizon 2020 (EU2020) research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 713567. The ADAPT SFI Centre for Digital Media Technology is funded by Science Foundation Ireland through the SFI Research Centres Programme and is co-funded under the European Regional Development Fund (ERDF) through Grant # 13/RC/2106.</p>
<p>17. Zhang, C., Masseglia, F., Zhang, X.: Modeling and clustering users with evolving profiles in usage streams. In: 2012 19th International Symposium on Temporal Representation and Reasoning, pp. 133–140 (Sep 2012). https://doi.org/10.1109/TIME.2012.16</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Basile</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caputo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
          </string-name>
          , G.:
          <article-title>Temporal random indexing: a tool for analysing word meaning variations in news</article-title>
          . In: Martinez-Alvarez,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            ,
            <surname>Kazai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Hopfgartner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Corney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Albakour</surname>
          </string-name>
          ,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the First International Workshop on Recent Trends in News Information Retrieval co-located with 38th European Conference on Information Retrieval (ECIR</source>
          <year>2016</year>
          ), Padua, Italy, March
          <volume>20</volume>
          ,
          <year>2016</year>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>1568</volume>
          , pp.
          <volume>39</volume>
–
          <fpage>41</fpage>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2016</year>
), http://ceur-ws.org/Vol-1568/paper7.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bonneville-Roussy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rentfrow</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>M.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potter</surname>
          </string-name>
          , J.:
          <article-title>Music through the ages: Trends in musical engagement and preferences from adolescence through middle adulthood</article-title>
          . (
          <year>2013</year>
          ). https://doi.org/10.1037/a0033770
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Deerwester</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furnas</surname>
            ,
            <given-names>G.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landauer</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harshman</surname>
          </string-name>
          , R.:
          <article-title>Indexing by latent semantic analysis</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          <volume>41</volume>
          (
          <issue>6</issue>
          ),
          <volume>391</volume>
–
          <fpage>407</fpage>
          (
          <year>1990</year>
). https://doi.org/10.1002/(SICI)1097-4571(199009)41:6&lt;391::AID-ASI1&gt;3.0.CO;2-9
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
<article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume</source>
          <volume>2</volume>
          ,
          <string-name>
            <given-names>Short</given-names>
            <surname>Papers</surname>
          </string-name>
          . pp.
          <volume>427</volume>
–
          <fpage>431</fpage>
          . Association for Computational Linguistics, Valencia,
          <source>Spain (Apr</source>
          <year>2017</year>
          ), https://www.aclweb.org/anthology/E17-2068
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Jurgens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Event detection in blogs using temporal random indexing</article-title>
          .
          <source>In: Proceedings of the Workshop on Events in Emerging Text Types</source>
          . pp.
          <volume>9</volume>
–
          <fpage>16</fpage>
          . Association for Computational Linguistics, Borovets,
          <source>Bulgaria (Sep</source>
          <year>2009</year>
), https://www.aclweb.org/anthology/W09-4302
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kanoje</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girase</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mukhopadhyay</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
<article-title>User profiling trends, techniques and applications (</article-title>
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Dynamic user profiling for streams of short texts</article-title>
          . In:
          <string-name>
            <surname>McIlraith</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weinberger</surname>
            ,
            <given-names>K.Q.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18)</source>
          , New Orleans, Louisiana, USA, February 2-7,
          <year>2018</year>
          . pp.
          <fpage>5860</fpage>
          -
          <lpage>5867</lpage>
          . AAAI Press (
          <year>2018</year>
          ), https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16646
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yilmaz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rijke</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          :
          <article-title>Inferring dynamic user interests in streams of short texts for user clustering</article-title>
          .
          <source>ACM Trans. Inf. Syst.</source>
          <volume>36</volume>
          (
          <issue>1</issue>
          ),
          <fpage>10:1</fpage>
          -
          <lpage>10:37</lpage>
          (Jul
          <year>2017</year>
          ). https://doi.org/10.1145/3072606, http://doi.acm.org/10.1145/3072606
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Dynamic embeddings for user profiling in Twitter</article-title>
          .
          <source>In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          . pp.
          <fpage>1764</fpage>
          -
          <lpage>1773</lpage>
          . KDD '18, ACM, New York, NY, USA (
          <year>2018</year>
          ). https://doi.org/10.1145/3219819.3220043, http://doi.acm.org/10.1145/3219819.3220043
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Matz</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosinski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nave</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stillwell</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          :
          <article-title>Psychological targeting as an effective approach to digital mass persuasion</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>114</volume>
          (
          <issue>48</issue>
          ),
          <fpage>12714</fpage>
          -
          <lpage>12719</lpage>
          (
          <year>2017</year>
          ). https://doi.org/10.1073/pnas.1710966114, https://www.pnas.org/content/114/48/12714
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Efficient estimation of word representations in vector space</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          . pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sahlgren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>An introduction to random indexing</article-title>
          .
          <source>In: Methods and applications of semantic indexing workshop at the 7th international conference on terminology and knowledge engineering</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Wald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoshgoftaar</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sumner</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Machine prediction of personality from Facebook profiles</article-title>
          .
          <source>In: 2012 IEEE 13th International Conference on Information Reuse Integration (IRI)</source>
          . pp.
          <fpage>109</fpage>
          -
          <lpage>115</lpage>
          (Aug
          <year>2012</year>
          ). https://doi.org/10.1109/IRI.2012.6302998
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ru</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>Discovering user interest on Twitter with a modified author-topic model</article-title>
          .
          <source>In: 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology</source>
          . vol.
          <volume>1</volume>
          , pp.
          <fpage>422</fpage>
          -
          <lpage>429</lpage>
          (Aug
          <year>2011</year>
          ). https://doi.org/10.1109/WI-IAT.2011.47
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Dynamic word embeddings for evolving semantic discovery</article-title>
          .
          <source>In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining</source>
          . pp.
          <fpage>673</fpage>
          -
          <lpage>681</lpage>
          . WSDM '18, ACM, New York, NY, USA (
          <year>2018</year>
          ). https://doi.org/10.1145/3159652.3159703, http://doi.acm.org/10.1145/3159652.3159703
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>