Using Uncertain Graphs to Automatically Generate Event Flows from News Stories

Laura Christiansen (Center for Web Intelligence, Chicago, Illinois, lchris10@cdm.depaul.edu)
Bamshad Mobasher (Center for Web Intelligence, Chicago, Illinois, mobasher@cs.depaul.edu)
Robin Burke (Center for Web Intelligence, Chicago, Illinois, rburke@cs.depaul.edu)

ABSTRACT
Capturing the branching flow of events described in text aids a host of tasks, from summarization to narrative generation to classification and prediction of events at points along the flow. In this paper, we present a framework for the automatic generation of an uncertain, temporally directed event graph from online sources such as news stories or social media posts. The vertices are generated using Natural Language Processing techniques on the source documents, and the probabilities associated with edges, indicating the degree of certainty that those connections exist, are derived from shared entities among events. Graph edges are directed based on temporal information on events. Furthermore, we apply uncertain graph clustering in order to reduce noise and focus on higher-level event flows. Preliminary results indicate the uncertain event graph produces a coherent navigation through events described in a corpus.

Reference Format:
Laura Christiansen, Bamshad Mobasher, and Robin Burke. 2017. Using Uncertain Graphs to Automatically Generate Event Flows from News Stories. In Proceedings of Workshop on Social Media World Sensors at ACM Hypertext 2017 (SIDEWAYS, HT'17), 6 pages, CEUR-WS.org.

SIDEWAYS, HT'17, July 4, 2017, Prague, Czech Republic. Copyright held by the author(s).

1 INTRODUCTION
Extracting a narrative progression from text opens the door for a host of useful applications. Representations of the key stories can be simplified or expanded upon to aid comprehension. Examining the dynamics of the narrative events can reveal emergent information and points of change that may be useful not only in understanding the story but in predicting future dynamics. One can observe how paths differ when looking at different domains, such as news sources versus social media, providing insight into how both represent events.

Understanding the flow of information over time is valuable. Intuitively, we understand that flow is not flat. One event may branch out to connect to events later in time. Likewise, many events may feed into a single event. An extracted timeline for a narrative will capture the temporal ordering but lose information on the connections between events. On the other hand, evidence of a connection between events may be incorrect. Inferring connections can lead to differing levels of certainty in the likelihood of those connections. This lends itself to probabilistic or uncertain graphs, where edges have probabilities of their existence. An uncertain graph is not discrete but is rather a template to generate all "possible worlds": all discrete graphs that can be drawn from the edge probabilities.

We propose a framework for automatically extracting an uncertain event graph, with edges directed by the temporal flow of events, from online sources such as news stories or social media posts. The first stage of this process involves event extraction using natural language processing (NLP) techniques. Part-of-speech (POS) tagging and semantic role labeling (SRL) allow us to extract predicates, which we treat as events. Entity detection labels which named entities are involved in those events, while temporal expression extraction defines the temporal ordering between events. Next, we generate the edges and their probabilities based on Bayesian combination of evidence. As basic evidence, we use the extracted entities shared between events and the proximity of event references within the text. We focus on simple, text-based evidence of events and their connections, but more complex information derived from metadata can be utilized. Different domains offer different possible sensors for detecting and tracking events. This process generates the vertices and edges of the uncertain graph, and those edges can then be directed based on the temporal ordering information discovered in the first stage.

Once the full event graph is generated, we use uncertain graph clustering to reduce noise and discover higher-level abstractions, with clusters indicating closely related events describing a larger meta-topic within the graph. We use pKwikCluster, a clustering algorithm for uncertain graphs, to identify likely clusters. As a precursor to a larger user study evaluation, we observe the flow and connections within the graph to evaluate its coherency and correctness. Our preliminary results on a dataset consisting of news articles indicate this is a viable approach to automatically capturing and depicting a branching flow of events.

2 RELATED WORK
Linking and tracking events is a research problem that has been addressed from a number of angles. Extending their previous work in event extraction, Rospocher et al. [6] propose an approach for automatically generating knowledge graphs based on the discovered events. In this knowledge graph, edges are predicates and nodes are entities, as opposed to combining both within an event construct. Moving in the opposite direction, Althoff et al. [2] generate timelines from knowledge graphs. The generated timelines are personalized and provide a temporal ordering but not branching connections. Shahaf et al. [8] developed an algorithm for generating zoomable, intersecting timelines of key terms to summarize news.
These timelines are constructed in relation to each other, and the terms that make up the nodes are annotated with news stories; the approach is only intended for a high-level event representation.

Event detection and extraction has been approached in a number of ways. Work in [7] discovers a specific type of event, earthquake occurrences, from microblogs. User chatter on earthquakes is classified and filtered to act as a sensor for determining when and where an earthquake strikes. Also working with microblogs, [1] clusters tweets based on keywords and locations to detect new events. Keywords, combined with the time and place they were posted, form a rough event reference. In [10], events are automatically pulled from streaming news data; news relations are extracted, then clustered to find different representations of the same event, before a model is trained to extract news relations based on that co-reference information. This attempts to overcome issues of different linguistic descriptions of the same event. We only address this issue indirectly, through co-reference resolution, but are similarly interested in basic relations in the form of verb predicates.

When constructing networks, there may be doubts regarding the accuracy of connections between nodes due to the techniques used to construct those connections. Link prediction may be erroneous, or sensors may have detected noise. Uncertain graphs tackle this problem by assigning probabilities of existence to edges; an uncertain graph is the template with which to generate a set of possible discrete graphs based on those probabilities. For example, Zhao et al. [11] use uncertain graphs to detect protein complex structures. In their graphs edges are interactions between structures, but there is noise in the data on when they interact. Parchas et al. [5] propose a method to generate the best discrete approximation from an uncertain graph. We avoided generating discrete graphs from our uncertain graph, relying instead on algorithms that approximate calculations over the set of discrete graphs. Clustering algorithms are extended to uncertain graphs by Kollios et al. [4], and we use their adaptation of pKwikCluster and their definition of estimated edit distance over an uncertain graph to aggregate our event vertices. Bonchi et al. [3] examine how to perform core decomposition in an uncertain graph context, an approach we did not use but which may be useful in creating higher-level representations of an uncertain event graph.

3 UNCERTAIN EVENT GRAPHS
In this section, we describe our method for constructing a temporally directed, uncertain event graph. First, we extract the necessary information from text using a variety of NLP techniques to construct the event vertices. Once events are defined, we proceed to the definition of edges, their probabilities, and their direction. Finally, we aggregate events within clusters to provide a higher-level representation of the event graph structure.

3.1 Event extraction
The first stage is to discover the events described in the data. At a basic level, this process includes the identification of an action that occurs and the entities involved. To this end, we define our initial event references in terms of predicates. Predicates define actions within a sentence and serve as an anchor point for additional details involving the subjects and objects. To extract this information from text, we run POS tagging and dependency parsing. This identifies which parts of speech the terms in a sentence have, as well as the sentence structure; with SRL, the roles entities play within a predicate can be further identified. Take the sentence "John bought a car in Boston"; using dependency parsing and SRL, we can identify "bought" as the predicate verb, "John" as the subject, "a car" as the object, and "in Boston" as the location. We consider predicates to be references to events rather than events themselves; this distinction is important, as the same event may have multiple references. To examine the most basic references and avoid redundancy, we focus on the smallest predicates: predicates containing other predicates were pruned.

Co-reference resolution, another established NLP task, enables discovery of multiple representations of the same event within the same document. Two predicates co-referencing each other indicate the same event is discussed. Our definition of an event can now encompass multiple predicates based on co-references. To expand our example, if another sentence read "He bought it last Friday", co-reference resolution can tell us whether this instance of "bought" refers to the same event as the predicate verb in the first sentence. Similarly, it can tell us whether "he" refers to "John". This helps better define the entities involved in an event; in our event reference we can substitute the more informative proper noun for the pronoun.

This substitution is further enhanced by Named Entity Recognition (NER). NER identifies and classifies named entities as people, locations, or organizations. To continue our example, NER would identify "John" as a person and "Boston" as a location. Combining this with co-reference resolution, we can find all co-references to a named entity and include the entity information in those events.

Finally, the temporal relationships between events can be ascertained through temporal expression extraction. In some cases, this finds the fixed time interval described in the text. In others, it is relative: the exact date of event E1 might not be known, but we know it took place before event E2. Either way, this places parts of the text in time. Another sentence might tell us "Afterwards, John bought coffee"; we can label the coffee purchase event as occurring after the car buying. By knowing this, we know the temporal flow of events.

We use the English language version of the Newsreader pipeline (http://www.newsreader-project.eu/) to perform these NLP tasks on our dataset. Described in [9], the pipeline is a series of NLP modules intended primarily for news text. We are not using the output of the entire pipeline; instead, our focus is POS tagging, dependency parsing and SRL, co-reference resolution, NER, and temporal relations. Each event has at least one predicate representation and includes information on the roles within that predicate as well as any named entities involved. If an event contains no entities, it is removed. This describes our event extraction from within a single text source.

The same events may also be referenced between documents, which is not identified by the techniques described above. To tackle this, we first look at the date ranges of the events. Events whose known time intervals overlap are candidates to be combined. We also include events without an explicit time interval but whose document publication dates are within a day of each other. This extension makes sense in the context of news articles but should be omitted or replaced for other datasets. Candidates for merging then have their term sets compared via Jaccard similarity, defined in equation 1. These term sets are pruned to exclude conjunctions, articles, and punctuation. Any candidates with a Jaccard similarity greater than threshold α are combined.

J(A, B) = |A ∩ B| / |A ∪ B|   (1)

3.2 Edge generation
The vertices in the uncertain graph are the events we have just described. The next stage is to generate the edges in the graph. For this preliminary work, we examine basic relationships indicating two events are connected. As we are constructing an uncertain graph, this requires computing the probability that a link between two events exists given the evidence at hand. Entity similarity and document co-occurrence proximity between events are the types of evidence we use in the proposed approach. The first measure examines whether the same entities are involved in two events; we posit two events sharing entities are more likely to be related than those that don't. The second measure, intra-document proximity, operates on the assumption that an author is not jumping from tangent to tangent within their writing: the closer the descriptions of two events are within the text of a document, the more related we can assume those events are.

Given those assumptions, we define the probability of a link existing between two events given their entities and intra-document proximity. Let L represent whether two events are linked, En the shared entities between events, and D the intra-document proximity between events. Assuming both sources of evidence are conditionally independent given L, we calculate the probability of a link existing with equation 2, using Bayes' rule. For P(L), we assume an ignorant prior of 0.5.

P(L|En, D) = P(L)P(En, D|L) / P(En, D) = P(L)P(En|L)P(D|L) / P(En, D)   (2)

where

P(En, D) = P(L)P(En|L)P(D|L) + P(¬L)P(En|¬L)P(D|¬L)

P(En|L) is defined here as the average Jaccard similarity between the predicates of two events. This is an average because, either through co-referencing predicates or the combination of events between documents, an event can have more than one predicate representation. In equation 3, m and n represent the number of predicates describing events E1 and E2 respectively, while En_i and En_j refer to the entity sets for predicates p_i and p_j. J(En_i, En_j) is the Jaccard similarity between those entity sets, as defined in equation 1.

P(En|L) = (1/(m × n)) Σ_{i=1..m} Σ_{j=1..n} J(En_i, En_j)   (3)

P(D|L) = 1 − (1/(m × n)) Σ_{i=1..m} Σ_{j=1..n} dist(p_i, p_j)/t_d   (4)

Additive smoothing of 0.1 is applied to equations 3 and 4 so that event pairs with a probability of 0 in one are not immediately removed from the graph. For each pair of events, the value of P(L|En, D) is calculated. In the uncertain graph, this represents the probability that the two events are linked and, subsequently, that an edge exists between them. Additional evidence can easily be incorporated using Bayes' rule, extending equation 2.

Taking into account the temporal information extracted during the NLP stage, we can direct the uncertain graph edges, either by comparing explicit time intervals or through relative temporal relations. Edges for which we have no temporal information are omitted from the graph; otherwise, each edge is directed from the earlier to the later event. If both events occur simultaneously, the edge is bidirectional. Events without any possible edges are pruned from the graph. We now have a temporally directed uncertain graph of events.

3.3 Event abstraction
For ease of quickly interpreting a large event graph, some degree of aggregation and abstraction is useful. It provides a simpler representation and further information on how the events are related to one another. To accomplish this aggregation, we turn to graph-based clustering. This is complicated, as the event graph is not discrete but rather an uncertain graph that can be used to generate a large set of discrete graphs.

In [4], clustering methods were extended to apply to uncertain graphs. We borrow their pKwikCluster, an adaptation of kwikCluster, to find event clusters within the graph. The pKwikCluster algorithm is simple: pick an available vertex at random as a new cluster, add to that cluster all available neighbors with an edge probability greater than 0.5, then mark all vertices in the new cluster as unavailable. Repeat these steps until every vertex is part of a cluster. This algorithm is run multiple times, and the results of each run are compared to determine which produced the best clustering.

The goodness of a clustering result can be evaluated, in part, by the edit distance. For the non-probabilistic kwikCluster, this is the edit distance between the graph and the cluster graph: assuming all edges between clusters are omitted, it indicates how many changes need to be made to the base graph's structure to accommodate that result. For an uncertain graph, the expected edit distance is given in equation 5, where E_Q is the edge set of the cluster graph Q. The clustering run that minimizes this metric is selected.

D(G, Q) = Σ_{{u,v} ∈ E_Q} (1 − P_uv) + Σ_{{u,v} ∉ E_Q} P_uv   (5)
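The clustering and selection procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the uncertain graph is represented as a dictionary mapping unordered vertex pairs to edge probabilities, and the number of runs and seed are arbitrary choices.

```python
import random

def pkwik_cluster(vertices, edge_prob, threshold=0.5, rng=random):
    """One run of pKwikCluster: pick a random available pivot and cluster
    it with all available neighbors whose edge probability exceeds threshold."""
    available = set(vertices)
    clusters = []
    while available:
        pivot = rng.choice(sorted(available))
        cluster = {pivot}
        for v in available - {pivot}:
            if edge_prob.get(frozenset((pivot, v)), 0.0) > threshold:
                cluster.add(v)
        available -= cluster
        clusters.append(cluster)
    return clusters

def expected_edit_distance(clusters, vertices, edge_prob):
    """Expected edit distance (equation 5) between the uncertain graph and
    the cluster graph Q: an intra-cluster pair missing its edge costs
    1 - P_uv, an inter-cluster pair with an edge costs P_uv."""
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    verts = sorted(vertices)
    cost = 0.0
    for i, u in enumerate(verts):
        for v in verts[i + 1:]:
            p = edge_prob.get(frozenset((u, v)), 0.0)
            if label[u] == label[v]:
                cost += 1.0 - p  # pair inside a cluster: edge should exist
            else:
                cost += p        # pair across clusters: edge should not exist
    return cost

def best_clustering(vertices, edge_prob, runs=20, seed=0):
    """Run pKwikCluster several times and keep the run minimizing equation 5."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(runs):
        clusters = pkwik_cluster(vertices, edge_prob, rng=rng)
        cost = expected_edit_distance(clusters, vertices, edge_prob)
        if cost < best_cost:
            best, best_cost = clusters, cost
    return best, best_cost
```

For example, with strong edges (a, b) = 0.9 and (c, d) = 0.8 and a weak edge (a, c) = 0.1, every run clusters {a, b} and {c, d}, with expected edit distance 0.1 + 0.2 + 0.1 = 0.4.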