
Workshop, Glasgow, Scotland

StreamGrid: Summarization of Large Scale Events using Topic Modelling and Temporal Analysis

Emmanouil Schinas

manosetro@iti.gr

Yiannis Kompatsiaris

ikom@iti.gr

Symeon Papadopoulos

papadop@iti.gr

Pericles A. Mitkas

mitkas@eng.auth.gr

Dept. of Electrical & Computer Engineering, Aristotle University of Thessaloniki

Information Technologies Institute, Centre for Research & Technology Hellas, Thessaloniki, Greece

2014

Due to the increasing popularity of microblogging platforms, the number of messages related to large-scale public events reaches impressive levels. Although such messages can be quite informative regarding different aspects of the main event, there is a lot of spam and redundancy, which makes it challenging to extract insights regarding the event of interest. In this work we describe a summarization framework that captures the important moments of an event by combining topic modelling and bursty activity detection. We propose a data structure named StreamGrid that maintains the information of active topics in regular time intervals at several scales. This structure is used to create concise summaries for any time interval. Finally, an evaluation on a large Twitter dataset around the Sundance Film Festival demonstrates the potential of the proposed framework.

1 Introduction

Due to their increasing popularity, micro-blogging platforms, and especially Twitter, have evolved into a powerful means of staying connected with real-world events. In large-scale public events, ranging from sports events, such as football matches, to political events and festivals, the users who are somehow involved in the event use social media to share their experiences and express their opinions. In many cases, these messages are quite informative, provide real-time coverage of the ongoing event, and may be correlated with important variables related to the event, e.g. film ratings [13]. Thus, not surprisingly, the number of event-related messages has reached impressive levels [1].

However, a significant percentage of micro-blogging messages can be considered non-informative or spam. This fact, combined with the huge number of messages, makes it very challenging for interested stakeholders, such as event organizers and enthusiasts, to monitor the evolution of the event and understand its important moments. In the case of long-running events, this becomes even more difficult due to the existence of numerous sub-events occurring within the main event. Such sub-events have different durations and impact on the main event. In addition, a large portion of the messages contain conversations about other entities of interest associated with the event. In other words, an event-related stream of messages is quite diverse and noisy, with different associated topics, conversations among users, and spam messages. Thus, there is a profound need for event-based summarization methods that can produce concise multi-document summaries for any time interval of the event, covering its main aspects.

The framework we propose in this work aims to create topic-based summaries of large-scale events for arbitrary time durations by applying post-analysis on the stream of event-related messages. First, we apply LDA topic modelling to discover the underlying aspects of the event. To support summarization, we create a 2D-array structure named StreamGrid, which maintains the information of each topic at each time interval. To create the grid we assign messages to the detected topics and divide topic-associated messages using regular time intervals. Next, we create timelines for the set of topics and use them to detect the set of active topics at each time interval by finding the bursty activity periods in them. A greedy algorithm is used to obtain a set of representative messages that maximizes the coverage of the event, by selecting the maximum possible number of active topics, and at the same time minimizes redundancy across messages. Finally, to demonstrate the potential of the proposed framework, we perform an experimental evaluation on a real-world dataset consisting of tweets around the Sundance Film Festival 2013.

The paper is organized as follows. Section 2 contains a brief survey of related methods and applications. Section 3 describes in detail the proposed framework. Section 4 presents an experimental case study on the Sundance 2013 dataset. We conclude the paper and describe future work in Section 5.

2 Related Work

A substantial body of work exists in the literature on the problem of micro-blogging summarization. A notable method for multi-document summarization relies on the computation of centroids based on content. Namely, the summary of a set of documents, represented as tf-idf vectors, consists of those documents that are closest to the centroid of the set [12]. Sharifi et al. [15] propose a method for the generation of a single sentence from a set of tweets by using a graph-based technique. Nichols et al. [11] describe an algorithm that generates a summary of sports events. They use a peak detection algorithm to detect important moments and then apply the method of [15] to extract summary sentences from the tweets around these moments. The work of [8] uses linear-programming optimization to select summary sentences from tweets related to trending topics. Notably, they also make use of linked Web content to extend the original sources of information.

Shen et al. [16] present a participant-based approach for event summarization. A mixture model is proposed to detect sub-events at participant level, and the tf-idf centroid approach is used to create a summary of each sub-event. Similarly, Chakrabarti and Punera [4] propose the use of a Hidden Markov Model to obtain a time-based segmentation of the stream that captures the underlying sub-events. Alonso and Shiells [2] create timelines for football games, annotated with the key aspects of the event. Dörk et al. [5] propose an interface for large-scale events that employs several visualizations for interactive presentation of the event.

A different problem is tackled by Wang et al. [19]. Unlike other methods, their method aims to create a storyline from a set of event-related objects. A multi-view graph of objects is constructed, where the two types of edges capture the contextual similarity and the temporal proximity among objects. Then a time-ordered sequence of important objects is obtained via graph optimization. Lin et al. [7] extend the previous work to generate storylines from a set of micro-blog messages for arbitrary queries. To achieve this, they use query expansion techniques to retrieve the query-related messages and then apply the same method as [19] to create the storyline.

Another approach for summarizing evolving tweet streams is proposed by the Sumblr framework [17]. This relies on an online clustering algorithm for tweets and on maintaining distilled statistics of the clusters at specific time snapshots using a structure named Pyramidal Time Frame. Then, a summarization technique based on the LexRank method [6] is employed for generating summaries of arbitrary time durations.

3 Proposed Method

An overview of the proposed method is illustrated in Figure 1. The proposed framework processes a stream of online messages around an event and extracts informative summaries for any requested time duration. In other words, the proposed framework identifies a set of topics and then selects related messages based on their importance.

3.1 Topic Modelling

Topic modelling is based on the assumption that each document can be described as a random mixture of topics and each topic as a multinomial distribution over terms. In our approach we employ topic modelling by using the well-known Latent Dirichlet Allocation model [3] across the whole stream of messages. This process is applied after the end of the event, when all the messages are available. However, topic modelling in micro-blog messages is problematic due to the short length of their text. Many approaches have been proposed to overcome this. To avoid changes to standard LDA, a relatively simple solution is message pooling, in which messages are pooled together to form larger documents. We experimented with four methods of message pooling, in a similar way to [10]. First, we merged messages using constant-length time bins. Second, we merged messages of the same author to form a single document. As a third option, we pooled messages together based on their hashtags. Messages with multiple hashtags were assigned to multiple documents, and messages without any hashtag were assigned to the document with the highest textual similarity. As a fourth option, we used a 1-NN clustering algorithm to cluster messages with high textual similarity. Each of those clusters formed a single document for the LDA method. In addition, for all of the pooling methods we filtered out messages having only one term and removed standard stopwords to discard the non-informative terms.
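The hashtag-based pooling option described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the text does not state which textual similarity measure was used for untagged messages, so the Jaccard overlap of token sets is an assumption here, and the function names `pool_by_hashtag`, `tokens` and `jaccard` are hypothetical.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")

def tokens(text):
    """Lowercased word tokens of a message (a hashtag contributes its word part)."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap of two token sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pool_by_hashtag(messages):
    """Group messages into pseudo-documents keyed by hashtag."""
    pools = defaultdict(list)
    untagged = []
    for msg in messages:
        tags = {t.lower() for t in HASHTAG.findall(msg)}
        if tags:
            for tag in tags:  # a message with several hashtags joins every matching pool
                pools[tag].append(msg)
        else:
            untagged.append(msg)
    # Vocabulary of each pool, built from the tagged messages only.
    vocab = {tag: set().union(*(tokens(m) for m in msgs)) for tag, msgs in pools.items()}
    if vocab:
        for msg in untagged:
            # Untagged messages go to the pool with the most similar vocabulary.
            best = max(vocab, key=lambda tag: jaccard(tokens(msg), vocab[tag]))
            pools[best].append(msg)
    return dict(pools)
```

Each returned list would then be concatenated into one document before estimating LDA.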

Another drawback of LDA is that the number of topics must be defined in advance; obviously, the number of topics is not known a priori in the context of large events. To determine the optimal number of topics for a given set of documents $D$, we calculate two metrics, perplexity and average similarity across topics, for different numbers of topics and choose a value that minimizes both metrics. For the calculation of perplexity we split $D$ into training and test documents, estimate LDA over a range of possible numbers of topics using $D_{train}$, and calculate the total perplexity of the documents in the test dataset $D_{test}$ [18]. The perplexity of a document $d$ given a trained model is defined as follows:

$$\mathrm{perplexity}(d) = \exp\left(-\frac{\log P(d \mid \theta, \phi, G)}{L_d}\right) \tag{1}$$

where $L_d$ is the number of terms in document $d$, $\theta$ is the document-specific topic distribution, $\phi$ is the word distribution for topics, and $G$ is the set of topics in the trained model. The total perplexity over dataset $D_{test}$ is defined as

$$\mathrm{perplexity}(D_{test}) = \exp\left(-\frac{\sum_{d \in D_{test}} \log P(d \mid \theta, \phi, G)}{\sum_{d \in D_{test}} L_d}\right) \tag{2}$$

For the similarity between two topics, we calculate the Jaccard coefficient on the sets of top $N$ terms of each topic.

After the detection of topics we have to associate messages with topics. We use the LDA model, estimated from the merged documents, to infer the probabilities of each message over the set of topics. We assign each message to the topic with the highest probability, under the condition that this probability exceeds a predefined threshold. Although thresholding in this step leaves some messages unassigned, this is a desirable feature of the procedure, as most of the unassigned messages are of low quality. In other words, these messages can be considered spam that cannot contribute any valuable information to the summary.

Next, the assignments are used for the creation of a data structure named StreamGrid. The first dimension of this grid comprises the detected topics and the second corresponds to time, divided into regular time intervals. Each cell $c(i, j)$ of StreamGrid contains the set of messages $M_{ij}$ associated with $\mathrm{topic}_i$ at time interval $j$. Each message $m$ is represented as a tf-idf vector. The idf components are pre-computed over the whole set of messages. The tf part is the frequency of a term in the message normalized by the maximum frequency. Due to the short length of the documents in micro-blogging platforms, this component often equals one. Using the set of associated messages in each cell, we calculate a merged tf-idf vector $v_{ij}$. In addition, we calculate a weight for each message and rank the messages according to it. The weight of a message $m$, associated with $\mathrm{topic}_i$, in a specific time window $j$ is defined as the sum of the weights of the terms contained in $m$. To calculate the weight of each term $t$, we use the following tf-idf scheme:

$$W(t, i, j) = tf_{ij}(t) \cdot idf(t) \tag{3}$$

$$W(m, j) = \sum_{t \in m} W(t, i, j) \tag{4}$$

where $tf_{ij}(t)$ is the frequency of term $t \in v_{ij}$ in the cell $c(i, j)$ of StreamGrid, $idf(t)$ is the inverse document frequency over the whole corpus, $W(t, i, j)$ is the weight of term $t$ in $c(i, j)$, and $W(m, j)$ is the weight of message $m$ in time interval $j$.
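As an illustration of the thresholded assignment, the grid cells, and the weights of Equations 3 and 4, the following Python sketch builds a dictionary-based grid. Several details are assumptions rather than facts from the text: the threshold value of 0.5, the idf form $\log(N / df)$, the use of raw counts for $tf_{ij}(t)$, and all function names. Topic probabilities are assumed to be already inferred by an LDA model.

```python
import math
from collections import Counter, defaultdict

def assign_topic(topic_probs, threshold=0.5):
    """Most probable topic index, or None when its probability does not exceed the threshold."""
    best = max(range(len(topic_probs)), key=topic_probs.__getitem__)
    return best if topic_probs[best] > threshold else None

def build_grid(messages, interval, threshold=0.5):
    """messages: list of (timestamp, terms, topic_probs). Returns {(i, j): [terms, ...]}."""
    grid = defaultdict(list)
    for ts, terms, probs in messages:
        i = assign_topic(probs, threshold)
        if i is not None:  # unassigned messages are dropped as low quality
            grid[(i, int(ts // interval))].append(terms)
    return grid

def idf_table(messages):
    """idf over the whole message set, assumed here to be log(N / df)."""
    n = len(messages)
    df = Counter(t for _, terms, _ in messages for t in set(terms))
    return {t: math.log(n / df[t]) for t in df}

def term_weights(cell, idf):
    """Equation 3: W(t, i, j) = tf_ij(t) * idf(t), with tf counted over the whole cell."""
    tf = Counter(t for terms in cell for t in terms)
    return {t: tf[t] * idf[t] for t in tf}

def message_weight(terms, weights):
    """Equation 4: sum of the cell-level weights of the distinct terms of the message."""
    return sum(weights[t] for t in set(terms))
```

Ranking the messages of a cell then amounts to sorting them by `message_weight` in descending order.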

To detect the time intervals in which a specific $\mathrm{topic}_i$ of StreamGrid is active, we create a topic timeline by using time intervals as bins and counting the associated messages of $\mathrm{topic}_i$ in bin $j$. Then, we apply the peak detection algorithm used in [9] to detect time frames in the timeline that exhibit bursty behaviour. The algorithm identifies windows with high activity by finding significant increases in the timeline, compared to the historical mean value of activity. The time windows reported by the algorithm are used to set the active topics of each time interval. For example, if for a specific topic $i$ the algorithm identifies a time window $[a, b]$ with high activity, then we define all the time intervals $a \le j \le b$ as active moments of $\mathrm{topic}_i$. After this step, the cells of StreamGrid have a flag that indicates whether a specific cell is active or not. We use this flag to select a summary subset of messages, as described in the next section. Also, for each active $\mathrm{topic}_i$ in a specific time interval $j$, we calculate a score that captures its significance over the rest of the active topics $A$ in the same time interval:

$$\mathrm{Significance}(\mathrm{topic}_i, j) = \frac{|M_{ij}|}{\sum_{\mathrm{topic}_k \in A} |M_{kj}|} \tag{5}$$
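The flagging of active cells and the significance score can be sketched in Python as follows. The burst detector below is a deliberately simplified stand-in, not the algorithm of [9]: it flags a bin when its count exceeds an exponentially weighted running mean by more than `tau` running mean deviations. The parameter values, the deviation floor of 1.0, and the function names are all assumptions made for illustration.

```python
def bursty_bins(counts, tau=2.0, alpha=0.125):
    """Flag bins whose count rises well above the running mean of the earlier bins."""
    active = [False] * len(counts)
    if len(counts) < 2:
        return active
    mean = float(counts[0])
    dev = 1.0  # assumed floor on the mean deviation, so small fluctuations are not flagged
    for j in range(1, len(counts)):
        c = counts[j]
        if (c - mean) / dev > tau:
            active[j] = True
        # Update the historical statistics with exponential weighting.
        dev = max(1.0, (1 - alpha) * dev + alpha * abs(c - mean))
        mean = (1 - alpha) * mean + alpha * c
    return active

def significance(sizes, active_topics, j):
    """Share of each active topic among all active topics in interval j (sizes[i][j] = |M_ij|)."""
    total = sum(sizes[k][j] for k in active_topics)
    return {i: sizes[i][j] / total for i in active_topics} if total else {}
```

With one timeline per topic, the active topics of interval `j` are those whose flag list is true at position `j`, and `significance` is evaluated over exactly that set.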

In addition, to have an overall estimation of the importance of each topic throughout the event, we calculate two measures for each topic using a similar approach to [14]. More specifically, we define the peakiness of a topic as:

$$\mathrm{peakiness}(\mathrm{topic}_i) = \frac{\max_{j} |M_{ij}|}{\sum_{j} |M_{ij}|} \tag{6}$$

and its persistence as

$$\mathrm{persistence}(\mathrm{topic}_i) = \frac{\mathrm{avg}_{j > t_{peak}} \left( |M_{ij}| / \sum_{j'} |M_{ij'}| \right)}{\mathrm{avg}_{j < t_{peak}} \left( |M_{ij}| / \sum_{j'} |M_{ij'}| \right)} \tag{7}$$

where $t_{peak}$ is the time at which the maximum peak of the timeline occurs.
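A small Python sketch of the two measures, computed from a topic timeline given as a list of per-interval message counts. Returning `None` when the persistence ratio is undefined (a peak at either end of the timeline, or no messages before the peak) is a choice made for this sketch and is not specified in the text.

```python
def peakiness(counts):
    """Fraction of a topic's messages that fall in its largest bin."""
    total = sum(counts)
    return max(counts) / total if total else 0.0

def persistence(counts):
    """Ratio of the mean post-peak share to the mean pre-peak share of the topic's messages."""
    total = sum(counts)
    if total == 0:
        return None
    t_peak = counts.index(max(counts))  # first bin holding the maximum
    before, after = counts[:t_peak], counts[t_peak + 1:]
    if not before or not after or sum(before) == 0:
        return None  # undefined in this sketch
    avg_before = sum(c / total for c in before) / len(before)
    avg_after = sum(c / total for c in after) / len(after)
    return avg_after / avg_before
```

A highly localized topic yields a peakiness close to one, while a topic that keeps being discussed after its peak yields a persistence of one or more.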

3.3 Topic-Time Summarization

Our goal is to use the StreamGrid to summarize the event for an arbitrary time frame. By summary we denote a set of representative messages that mention the key aspects of the selected time period. Assuming that topics can capture these aspects, we use the active topics for that period to create a summary that meets the following criteria: a) as many aspects as possible are covered and b) redundancy due to near-duplicate messages is minimized. To achieve this, we use an adapted version of the greedy algorithm used in [17]. The algorithm selects messages that are associated with different topics and that simultaneously have a low degree of textual similarity to each other. The selection process is detailed in Algorithm 1. For an arbitrary time frame $F = [a, b]$, we first find the sequence of time intervals in StreamGrid that covers $F$. Then we get the set of active topics. A $\mathrm{topic}_i$ is active in $F$ if any cell $c(i, j)$ contained in $F$ is active. Also, the significance score of an active topic in $F$ is defined as the maximum significance score across all time intervals in $F$. The weight $W(t, i, F)$ of a term $t$ for $\mathrm{topic}_i$ in $F$ is defined as the sum of the weights in each cell $c(i, j) \in F$. In a similar way, we define the weight $W(m, F)$ of message $m$ over $F$. Note that although a message belongs to a specific time interval, we use the term weights across the whole time frame to calculate the weight of $m$.

Algorithm 1 Topic-Time summarization

    Input: StreamGrid, a time frame F, length of summary L
    Output: a summary set S
     1: S = ∅
     2: A = {set of active topics in F}
     3: M_c = {argmax_m W(m, i, F), ∀i ∈ A}
     4: while |S| < L and M_c ≠ ∅ do
     5:     for each message m in M_c do
     6:         calculate score(m) according to Equation 8
     7:     end for
     8:     select m_max = argmax_m score(m)
     9:     S = S ∪ {m_max}
    10:     M_c = M_c − {m_max}
    11: end while
    12: if |S| < L then
    13:     M = ∪ M_ij, ∀i ∈ A, j ∈ F
    14:     M' = M − S
    15:     while |S| < L do
    16:         for each message m in M' do
    17:             calculate score(m) according to Equation 8
    18:         end for
    19:         select m_max = argmax_m score(m)
    20:         S = S ∪ {m_max}
    21:         M' = M' − {m_max}
    22:     end while
    23: end if
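The two-phase greedy selection of Algorithm 1 can be sketched in Python as below. The exact form of Equation 8 is not reproduced in this text, so the `score` function is an assumed stand-in: a trade-off, weighted by a parameter `a`, between importance (topic significance times message weight) and redundancy (average cosine similarity to the messages already selected). The dictionary keys `topic`, `weight`, `significance` and `vec` are hypothetical, and the second phase also stops when no candidates remain.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse term-weight dictionaries."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def score(cand, selected, a):
    """ASSUMED form of Equation 8: a * importance - (1 - a) * redundancy."""
    importance = cand["significance"] * cand["weight"]
    redundancy = (sum(cosine(cand["vec"], s["vec"]) for s in selected) / len(selected)
                  if selected else 0.0)
    return a * importance - (1 - a) * redundancy

def greedy_pick(pool, summary, length, a):
    """Move the best-scoring candidate into the summary until it is full or the pool is empty."""
    pool = list(pool)
    while len(summary) < length and pool:
        best = max(pool, key=lambda m: score(m, summary, a))
        summary.append(best)
        pool.remove(best)
    return summary

def summarize(messages, length, a=0.5):
    """messages: dicts for the messages of the active topics in the requested time frame."""
    top = {}
    for m in messages:  # line 3: top-weighted message of each active topic
        if m["topic"] not in top or m["weight"] > top[m["topic"]]["weight"]:
            top[m["topic"]] = m
    summary = greedy_pick(top.values(), [], length, a)  # lines 4-11
    rest = [m for m in messages if not any(m is s for s in summary)]
    return greedy_pick(rest, summary, length, a)  # lines 12-23
```

Because redundancy is recomputed against the current summary at every step, a near-duplicate of an already selected message is penalized even when its own weight is high.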

To produce a summary $S$ of length $L$, the algorithm first gets the set of active topics as described above. Then, it collects the messages $M_c$ with the highest weight $W(m, F)$ in each active topic (line 3). In lines 4-11, the algorithm follows a greedy approach and selects the messages that maximize the score of Equation 8. This score consists of two parts weighted by a parameter $a$. The first part measures the importance of the message, while the second measures its redundancy with respect to the set of already selected messages. The importance of a message $m \in \mathrm{topic}_i$ is a combination of two factors: a) the significance of the topic it belongs to in this time frame, and b) the contribution of its textual content. To measure the redundancy of a message, we compute its average cosine similarity to the already selected messages. If the summary length is not reached, we perform the same selection process on the set of tweets that belong to the active topics (lines 12-23).

We conducted an evaluation of the proposed method on a dataset around the Sundance 2013 Film Festival, which took place between January 15th and 30th, 2013. We used the Streaming API of Twitter to acquire tweets containing terms related to Sundance and posted during the event. More precisely, we collected all tweets containing the hashtags #sundance, #sundance2013 and #sundancefest, and all the tweets that mentioned the official account of the Sundance Film Festival (@sundancefest). This resulted in a dataset of 201,752 tweets. Among them, 100,046 were original tweets, while the rest were retweets. Although using three hashtags and one mentioned account covers only a subset of all possible tweets about the event, we consider this subset sufficiently representative, as the vast majority of Twitter users tend to adopt the official hashtags provided by organizers during events.

4.2 Topic detection

Figure 2 shows the perplexity and average similarity for different numbers of topics K. Although there is significant variance for the different values of K, the main trend is for perplexity to decrease as K increases. As we can see from Figure 2, the average similarity between all pairs of topics appears to stabilize for values of K larger than 100 topics. However, having a large number of topics creates topics with very few associated messages. We found that for K > 200 there is a substantial proportion of topics that have no associated message. Taking these facts into account, we set K = 200 for the rest of the evaluation. Regarding the pooling scheme, merging tweets having the same hashtags into single documents gave us the best performance with respect to perplexity and average topic similarity.

The first part of Table 2 contains the top five topics with respect to peakiness and the second part the topics with the highest persistence ratio. Examining the set of persistent topics, we conclude that they can be divided into two main categories. The first comprises the truly persistent topics that are regularly discussed during the event, while the second is made up of multiplexed topics that LDA failed to split further. This is due to the fact that some topics are conceptually different but share a similar set of related terms. This obviously affects summarization performance, as for each topic we select only the top weighted message. Thus, if a topic contains more than one concept, the summarization algorithm selects only one concept and ignores the rest.
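The average topic similarity used above when choosing K is the mean Jaccard coefficient over all pairs of topics, computed on their top-N terms. A minimal Python sketch follows; the function names and the default of ten top terms are assumptions, and the topic-term lists are assumed to come from an already estimated model, ordered by decreasing probability.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient of two term collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_topic_similarity(topics, n=10):
    """Mean pairwise Jaccard coefficient over the top-n terms of each topic."""
    tops = [t[:n] for t in topics]
    pairs = list(combinations(tops, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
```

Evaluating this function for models estimated with different values of K gives the similarity curve that is inspected together with perplexity.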

Figure 3 depicts the timelines of the same two sets of topics respectively. It becomes obvious that peaky topics are highly localized, while persistent topics persist for the whole duration of the event. To provide a visual representation of the StreamGrid structure over the whole duration of the event, we represent it as a heat map (Figure 4). The coloured cells in the grid represent the time intervals in which the corresponding topics are active, and the colour of the cell gives the significance of each active topic at this point. As shown in Figure 4, StreamGrid appears to be sparse, as only a few cells in it contain active topics. However, one can also observe several topics (rows) that exhibit consistent activity over the whole duration of the event.

4.4 Summarization

Baselines: To evaluate the summaries produced with StreamGrid, we used five baseline methods. Given an arbitrary time interval, we first get the set of messages posted during this interval and then apply the following baselines to produce a summary of constant length L.

Random Summarizer: We randomly choose a subset of L tweets from the set of tweets.

Popularity Summarizer: We select the L most retweeted messages to form a summary. This favours the tweets that have attracted the attention of the audience. However, niche topics and potentially interesting events that gathered less attention tend to be missed.

tf-idf Summarizer: We use the tf-idf weighting scheme described in the previous section to get the L highest weighted tweets.

Cluster-based Summarizer: Instead of active topics, we divide the tweets of the time interval into L clusters using k-means clustering. For each cluster produced this way, we pick the highest weighted tweet using the tf-idf scheme.

LexRank Summarizer: We create a graph where nodes represent tweets and the weights of edges between nodes represent their pairwise cosine similarity. The total weight of a tweet is the sum of the weights of the adjacent edges. The summary consists of the L highest weighted tweets in the graph.
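The graph-based baseline described last can be sketched in a few lines of Python. The sketch follows the description above (each tweet is weighted by the sum of its cosine similarities to all other tweets); representing tweets by raw term counts after whitespace tokenization, rather than by tf-idf vectors, is a simplifying assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def degree_rank_summary(tweets, length):
    """Return the `length` tweets with the largest total similarity to all other tweets."""
    vecs = [Counter(t.lower().split()) for t in tweets]
    weights = [
        sum(cosine(v, w) for k, w in enumerate(vecs) if k != i)
        for i, v in enumerate(vecs)
    ]
    ranked = sorted(range(len(tweets)), key=lambda i: weights[i], reverse=True)
    return [tweets[i] for i in ranked[:length]]
```

Because similar tweets reinforce each other's weight, this ranking tends to select several near-duplicates of the dominant story, and nothing in it penalizes redundancy.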

Finally, we compare the results of the StreamGrid Summarizer to those of the baseline methods for five time intervals that are connected with high activity during the main event. We detect these intervals by applying the peak detection algorithm of the previous section to the timeline of the whole dataset. We rank the detected bursts according to the rate of tweets and use the top five of them. The details of these intervals are provided in Table 1.

Table 3 contains summaries consisting of five tweets using StreamGrid and three of the baselines for the time period around the Awards Ceremony of the Sundance Film Festival. Unsurprisingly, this is the time period with the highest peak during the event. During this period, what may reasonably be considered important pertains to the films that won awards. Such messages are usually posted by authoritative users and become highly retweeted. For this reason, summaries based on the number of retweets cover the winning films quite effectively. However, in other cases choosing very popular tweets does not lead to informative summaries. For example, in the third time interval the summary consists of tweets like "So freaking cool. #sundance http://t.co/C7a8rSaw" and "#Sundance day 4- leavin for Vegas now. Bye for now http://t.co/C2aRZnEC". These tweets were retweeted a lot, but may be considered non-informative for the event. On the other hand, StreamGrid-based summaries for the Awards Ceremony contain tweets about winning films, even though these messages are not very popular. That is an indication that StreamGrid may detect an important topic even in cases where it does not attract attention from many users. Regarding the Cluster-based Summarizer, an interesting feature is that avoidance of redundancy is inherent in the method, as similar messages are clustered together and only the highest weighted of them is selected for the summary. However, the weakness of the method is that not all clusters represent important aspects of the event.

Another indication of how topic modelling can improve summarization is the fact that StreamGrid, compared to the baselines, tends to include tweets that mention films. This happens because most of the topics detected by LDA are about films, so when the proposed summarization algorithm selects a set of tweets from the pool of active moments, this leads to the selection of film-associated tweets. We expect that, for other types of events, this behaviour will naturally generalize to other pertinent entities of interest that occur frequently enough to lead to the creation of topics. A noticeable disadvantage of baselines such as tf-idf and LexRank is their considerable redundancy. For example, in the case of LexRank four out of five tweets are related to the 'Fruitvale' film. This indicates that redundancy minimization is a necessary component of any summarization approach.

Finally, to evaluate how well the proposed method can create visual summaries, we apply it to the subset of tweets with embedded pictures. These tweets, which comprise about 10% of the dataset, create a considerably sparser StreamGrid, as the bursty periods in this subset are much fewer. An example of a multimedia summary using StreamGrid for the Awards Ceremony is shown in Figure 5. Comparing the StreamGrid-based multimedia summaries with the ones produced by the popular images (Figure 6), we observe that StreamGrid does not perform noticeably better in this task. This can be explained by the fact that tweets with embedded media have text of very low length and informativeness, which leads LDA to inferior performance with respect to the creation of representative topics and the assignment of messages to them. Regarding the redundancy in multimedia summaries, we found that using cosine similarity on the text accompanying the images as a metric of similarity between them is not appropriate for minimizing redundancy. This can be seen in the LexRank-based summary in Figure 7. To this end, a combination of visual and textual features is foreseen as a more suitable means for discarding similar images.

5 Conclusion and future work

In this work, we proposed a framework for the summarization of micro-blogging messages during large-scale events. The framework makes use of topic modelling to detect the underlying aspects of an event in the set of related messages. Then, for each topic it derives its temporal representation by associating messages with the discovered topics. Subsequently, a burst detection algorithm is used to find the important intervals for each topic. Finally, a greedy summarization algorithm generates summaries for arbitrary time intervals using the set of active topics for the same time duration. The results of experiments on a Twitter dataset around the Sundance Film Festival appear promising, demonstrating the potential of topic modelling for the multi-document summarization problem.

For future work, we first plan to compare our approach with competing summarization algorithms in a more systematic way, over more events and with the help of independent evaluators, with the goal of better capturing the subjective quality aspects of summarization. Taking into account the large number of topic modelling techniques that have appeared in the literature over the last years, we plan to investigate how the underlying model affects the summarization process. Furthermore, we intend to create a real-time version of StreamGrid, which could be used to get summaries of evolving and continuous streams of messages. To this end, we plan to employ more advanced topic modelling methods that can detect topic drift and unseen topics in new incoming messages. Finally, we will investigate methods to integrate popularity and user authority into the summarization process.

Acknowledgements: This work is supported by the SocialSensor FP7 project, partially funded by the EC under contract number 287975.

Table 3: Summaries for the Awards Ceremony period, by method (tf-idf, LexRank, Popularity, StreamGrid).

References

[1] Celebrating SB48 on Twitter. https://blog.twitter.com/2014/celebrating-sb48-on-twitter, 2014. [Online; accessed 27-Feb-2014].

[2] Alonso and Shiells. Timelines as summaries of popular scheduled events. In Proceedings of the 22nd International Conference on World Wide Web Companion, pages 1037-1044. International World Wide Web Conferences Steering Committee, 2013.

[3] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993-1022, Mar. 2003.

[4] Chakrabarti and Punera. Event summarization using tweets. In ICWSM, 2011.

[5] Dörk, Gruen, Williamson, and Carpendale. A visual backchannel for large-scale events. Visualization and Computer Graphics, IEEE Transactions on, 16(6):1129-1138, 2010.

[6] Erkan and D. R. Radev. LexRank: Graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res., 22(1):457-479, Dec. 2004.

[7] Lin, Li, Wang, Chen, and Li. Generating event storylines from microblogs. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 175-184, New York, NY, USA, 2012. ACM.

[8] Liu, Liu, and Weng. Why is "sxsw" trending? Exploring multiple text sources for Twitter topic summarization. In Proceedings of the ACL Workshop on Language in Social Media (LSM), pages 66-75, 2011.

[9] Marcus, M. S. Bernstein, Badar, D. R. Karger, Madden, and R. C. Miller. Twitinfo: Aggregating and visualizing microblogs for event exploration. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '11, pages 227-236, New York, NY, USA, 2011. ACM.

[10] Mehrotra, Sanner, Buntine, and Xie. Improving LDA topic models for microblogs via tweet pooling and automatic labeling. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '13, pages 889-892, New York, NY, USA, 2013. ACM.

[11] Nichols, Mahmud, and Drews. Summarizing sporting events using Twitter. In Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, IUI '12, pages 189-198, New York, NY, USA, 2012. ACM.

[12] D. R. Radev, Jing, Stys, and Tam. Centroid-based summarization of multiple documents. Inf. Process. Manage., 40(6):919-938, Nov. 2004.

[13] Schinas, Papadopoulos, Diplaris, Kompatsiaris, Y. Mass, Herzig, and Boudakidis. Eventsense: Capturing the pulse of large-scale events by mining social media streams. In Proceedings of the 17th Panhellenic Conference on Informatics, PCI '13, pages 17-24, New York, NY, USA, 2013. ACM.

[14] D. A. Shamma, Kennedy, and E. F. Churchill. Peaks and persistence: Modeling the shape of microblog conversations. In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work, CSCW '11, pages 355-358, New York, NY, USA, 2011. ACM.

[15] Sharifi, M.-A. Hutton, and J. Kalita. Summarizing microblogs automatically. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, pages 685-688, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.

[16] Shen, Liu, Weng, and Li. A participant-based approach for event summarization using Twitter streams. In Proceedings of NAACL-HLT, pages 1152-1162, 2013.

[17] Shou, Wang, Chen, and Chen. Sumblr: Continuous summarization of evolving tweet streams. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '13, pages 533-542, New York, NY, USA, 2013. ACM.

[18] H. M. Wallach, I. Murray, Salakhutdinov, and Mimno. Evaluation methods for topic models. In L. Bottou and M. Littman, editors, Proceedings of the 26th International Conference on Machine Learning (ICML), pages 1105-1112, Montreal, June 2009. Omnipress.

[19] Wang, Li, and Ogihara. Generating pictorial storylines via minimum-weight connected dominating set approximation in multiview graphs. In AAAI'12, 2012.