<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Evolution of Semantically Identified Topics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Victor Mireles</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Artem Revenko</string-name>
          <email>artem.revenkog@semantic-web.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Semantic Web Company</institution>
          ,
          <addr-line>Vienna</addr-line>
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Topics in a corpus evolve in time. Describing the way this evolution occurs helps us to understand the change in the prominence of concepts: one can gain intuition about which concepts become more important in a given topic, which substitute others, or which concepts become related. By defining topics as weighted collections of concepts from a fixed taxonomy, it is possible to know whether said evolution occurs within a branch of the taxonomy or whether hitherto unknown semantic relationships are formed or dissolved over time. In this work, we analyze a corpus of financial news and reveal the evolution of topics composed of concepts from the STW thesaurus. We show that using a thesaurus for building representations of documents is useful. Furthermore, the different abstraction levels encoded in a taxonomy are helpful for detecting different types of topics.</p>
      </abstract>
      <kwd-group>
        <kwd>topic discovery</kwd>
        <kwd>taxonomy</kwd>
        <kwd>thesaurus</kwd>
        <kwd>topic evolution</kwd>
        <kwd>topic modelling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>Of the many dimensions that can be ascribed to text corpora, the topics that they deal with is a very intuitive one for human readers. In a sense, topics constitute subgraphs of a "platonic knowledge graph", in which only certain concepts and certain relations exist. Two qualities of topics are of particular interest: they reflect the intentions and context of the author of a text, and they are often treated in different texts by different authors. For these reasons, understanding their evolution in time can serve as a proxy for studying the context of the authors.</p>
<p>In the case of corpora of news articles, understanding the evolution of topics can give insights into which entities become important in a given topic, or how they lose their importance. Furthermore, the detection of emerging and fading topics can be of interest for signalling major events.</p>
<p>In this work, we approach the study of topic evolution by using controlled, semantically enriched vocabularies. With our method, it is possible to describe the topics present at a certain time in terms of the topics present at previous time points. With this in hand, the method is able to recover stable topics that are consistently dealt with in the news. Furthermore, detection of important events that shift the composition of topics is also possible. Finally, we analyze the topic-identification power of different levels of abstraction, as defined by a thesaurus.</p>
    </sec>
    <sec id="sec-2">
      <title>Topic Discovery</title>
<p>
        Topic discovery, also known as topic modelling, is the task of analyzing a corpus and extracting from it clusters of terms that are semantically related. When approached with statistical tools, the semantic relationship of the discovered topics is deduced from their distributional properties: terms that co-occur more often are deemed to be semantically related. Usually topic discovery is approached by first representing documents in terms of bags of words, n-grams [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], or embedded representations [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], forming from such representations a document-term matrix and, finally, inferring topics from said matrix. This last step is performed by matrix decomposition methods such as SVD (a.k.a. LSA [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]) or NMF [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], or by generative probabilistic models such as LDA [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or PLSA [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The outcome of topic discovery is a collection of sets of terms, called topics, such that each document in the corpus can be assigned, often with a certain probability, to one of the topics.
      </p>
<p>
        In the scenario described above, the only semantic relations between terms that we have access to are those statistically discovered from the document-term matrix. However, in many applications further semantic relations between the terms are known. For example, if information about synonyms is available, topic detection can be done by counting occurrences of synsets [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
<p>In this work, we aim at incorporating further semantic information into the topic discovery process by use of a thesaurus. A thesaurus is a controlled vocabulary whose concepts are organized according to their hypernym/hyponym relations. In effect, it is a multihierarchical directed graph, where a node represents a concept and edges represent hypernym or hyponym relations. Each concept is assigned one or more labels: strings that can be matched against a document.</p>
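A thesaurus of this kind can be sketched as a small in-memory graph; the concepts and labels below are invented for illustration:

```python
# A toy thesaurus: each concept maps to its broader (hypernym) concepts.
# Multihierarchy is allowed: a concept may have several broader concepts.
broader = {
    "bond": ["financial instrument"],
    "stock": ["financial instrument", "ownership"],
    "financial instrument": [],
    "ownership": [],
}
# Labels: strings that can be matched against a document.
labels = {"stock": ["stock", "share", "equity"], "bond": ["bond"]}

# Leaves are concepts that are not broader than any other concept.
all_broader = {b for bs in broader.values() for b in bs}
leaves = [c for c in broader if c not in all_broader]
```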
    </sec>
    <sec id="sec-3">
      <title>Topic Evolution</title>
<p>
        Studying topic evolution can be seen as a study of the history of ideas [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. By performing topic discovery on several corpora, each of which has a timestamp, it is possible to see the transitions in the interests of the author(s) of the corpora. This might be useful for discovering how a given topic is treated differently at different times, thus constituting a proxy for studying the evolution of semiotics. Several approaches have been adopted for the study of topic evolution. Some perform topic discovery independently in every corpus and only afterwards analyze the relationships between the topics (e.g. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]), while others perform topic discovery in a corpus based on the topics discovered in the previous one (e.g. [
        <xref ref-type="bibr" rid="ref1 ref14 ref15">1, 15, 14</xref>
        ]). The former approach is subject to the variation inherent to topic discovery methods, which, in particular, can lead to topics being "lost" from one corpus to the next due to corpus quality or size. This can make the independently discovered topics difficult to compare. The latter approach has two main limitations: 1) new "flash topics" are hard to detect, and 2) the methods become over-sensitive to parameters, such as thresholds for estimating the number of topics. However, in preliminary experiments we have confirmed that dynamic topic models and plain NMF with subsequent estimation of transitions between different time points yield similar results.
      </p>
<p>In this work, we present an intermediate approach. Topics are discovered independently for each corpus, and corpora at successive times are analyzed to determine in which ways topics transitioned into others or appeared de novo. This second step allows us to describe the evolution of topics not just as the evolution of sets of co-occurring terms, but rather as the merging and splitting of existing topics. Hence, we are able to define the notion of a persistent topic, i.e. a topic that appears in several consecutive corpora.</p>
      <sec id="sec-3-1">
        <title>Data</title>
<p>
          The dataset we analyze is a financial news dataset. The news come from a single source (Bloomberg News) and are made available as supporting data in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. From the original dataset, we took articles between January 2009 and November 2013, totalling 447,145 documents. We consider the documents within each week, starting on Friday, a different corpus. Only a subset of the original dataset was used, in order to guarantee that all corpora have at least 50 documents. The sizes of the corpora range between 50 and 4900, with a mean corpus size of 2589 documents.
        </p>
<p>
          The semantic relationships between concepts that we use in this work are those expressed by the skos:narrower and skos:broader predicates in version 9.02 of the STW Thesaurus for Economics [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This thesaurus consists of 6221 concepts, of which 4108 are leaves (i.e. they have no narrower concepts). The wide range of concepts included in the thesaurus makes it ideal for analyzing the corpus of financial news, especially because of its concept schemes of Geographic Names and General Descriptors. For the purposes of this work, the predicate skos:topConceptOf is considered to be equivalent to skos:broader.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Methods</title>
<p>Entity extraction Entity extraction was performed using the PoolParty Semantic Suite (poolparty.biz). In brief, the texts are pre-processed in the same manner as the labels from the thesaurus: stopwords are removed, tokens are lemmatized, and n-grams of up to 4 tokens are constituted. Then, matching is performed to identify all the concepts appearing in the documents. Thus, for every concept c in the thesaurus and every document d, we have computed nd(c), the number of times any label of c appears in document d. Finally, all the documents corresponding to the same week were put together into a single corpus, which we denote by Cw, where w is the week number.</p>
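The counting step can be sketched as follows; the concept labels are invented, and simple lowercased substring matching stands in for the lemmatized n-gram matching described above:

```python
# Count n_d(c): occurrences of any label of concept c in document d.
# Real matching is lemmatized and n-gram based; lowercasing stands in here.
labels = {"stock": ["stock", "share"], "interest rate": ["interest rate"]}

def concept_counts(document):
    text = document.lower()
    return {c: sum(text.count(lab) for lab in labs)
            for c, labs in labels.items()}

n_d = concept_counts(
    "The interest rate rose and the stock market followed; share prices fell."
)
```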
        <p>Representing documents For each document d in each corpus, two vector
representations are computed.</p>
<p>The first, which we call the level 0 representation, contains information only about the concepts that have no narrower concept in the thesaurus. We call this set the set of leaves and denote its elements by l1, l2, ..., lm0. The level 0 representation of a document d is then an m0-dimensional vector V0(d) whose i'th entry is given by V0(d)[i] = nd(li)/nd, where nd is the number of tokens in document d. The second representation, called the level 1 representation, contains information only about those concepts that are broader concepts of some leaf. If we denote that set by b1, b2, ..., bm1, then this representation is an m1-dimensional vector whose i'th entry is given by</p>
<p>V1(d)[i] = Σ_{l ∈ L(bi)} nd(l)/nd, (1)</p>
        <p>
          where L(c) denotes the set of nodes which are narrower than concept c. Note that both of these representations consider only the occurrences of leaf concepts; the difference is that in the level 1 representation, the occurrences of concepts that are narrower than one same concept are grouped together. With these two representations defined, we can represent each corpus Cw by two matrices, A0(w) and A1(w), both of which have as many columns as there are documents in corpus Cw, but with m0 and m1 rows respectively.
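Equation (1) and the level 0 definition can be sketched directly from the counts nd(c); the counts and the two-level hierarchy below are made up for illustration:

```python
# Level 0: leaf frequencies; level 1: frequencies aggregated per broader concept.
# Toy counts and hierarchy, chosen only to illustrate equation (1).
n_d = {"stock": 2, "bond": 1, "inflation": 1}   # leaf counts in document d
n_tokens = 20                                    # nd: tokens in document d
leaves = ["stock", "bond", "inflation"]
narrower = {"financial instrument": ["stock", "bond"],
            "prices": ["inflation"]}             # L(b): leaves below b

V0 = [n_d[l] / n_tokens for l in leaves]
V1 = [sum(n_d[l] for l in narrower[b]) / n_tokens
      for b in ["financial instrument", "prices"]]
```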
</p>
        <p>
          Detecting topics in a week For each week w, we compute a Non-Negative Matrix Factorization (NMF) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] of the two matrices A0(w) and A1(w). NMF decomposes a matrix A ∈ R+^(m×n) into the product T S of two matrices, with T ∈ R+^(m×k) and S ∈ R+^(k×n). To choose the value of k, we first compute σ1 ≥ σ2 ≥ ... ≥ σn, all singular values of A in descending order, and then choose the k that maximizes the gap σk − σk+1. This method is equivalent to the eigenvalue gap trick [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] in the case of non-negative matrices. The NMF decompositions were computed using the scikit-learn library [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], setting the parameter l1_ratio to 1. We followed NMF by a sparsifying step: the smallest threshold τ was found (using gradient descent) such that if T̃ is the matrix resulting from setting to 0 all entries of T whose value is lower than τ, then the density of T̃ S is not more than one half the original density of A. From now on, we refer to T̃ simply as T. The use of NMF yields, for each week w, two pairs of matrices: T0(w), S0(w) and T1(w), S1(w). We call the matrices T resulting from NMF the concept-topic matrices. If T[c, j] &gt; 0, we say that concept c belongs to topic j with a degree of T[c, j]. Thus, for every week w we can compute two sets of topics, one for each representation.
        </p>
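The choice of k and the factorization step can be sketched as follows; the random low-rank matrix is an illustrative stand-in for A0(w) or A1(w), and the sparsification step described above is omitted:

```python
# Choose k by the largest gap between consecutive singular values,
# then factorize with NMF. The matrix here is toy low-rank data plus noise.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
A = rng.random((30, 2)) @ rng.random((2, 40)) + 0.01 * rng.random((30, 40))

sigma = np.linalg.svd(A, compute_uv=False)        # descending order
k = int(np.argmax(sigma[:-1] - sigma[1:]) + 1)    # gap sigma_k - sigma_{k+1}

model = NMF(n_components=k, init="nndsvd", random_state=0, max_iter=500)
T = model.fit_transform(A)                         # concept-topic matrix
S = model.components_                              # topic-document matrix
```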
<p>Detection of Topic Transitions For a representation q ∈ {0, 1} and two consecutive weeks, the previous steps yield the matrices Tq(w) ∈ R+^(mq×k1) and Tq(w+1) ∈ R+^(mq×k2), where k1 and k2 are the numbers of topics in the respective weeks. In order to detect transitions between topics in these two consecutive weeks, we solve the optimization problem of finding the matrix Mq(w) ∈ [0, 1]^(k1×k2) that minimizes ||Tq(w)Mq(w) − Tq(w + 1)||2. The resulting matrix Mq(w), which we call the transition matrix, expresses each topic in week w + 1 as a linear combination of the topics in week w. The following points help in the interpretation of transition matrices:
- If two topics merge into a new topic in the next week, then the new topic's column will have large entries in the rows corresponding to the two previous topics.
- A new topic in week w + 1 will correspond to a column whose entries are all small: it is not similar to any topic in the previous week.
- A topic whose concepts don't change between two consecutive weeks can be detected by a column and a row that both have a single entry close to 1.
- One can think of a topic transition as the process of a topic from the previous week distributing its weight among the topics of the current week. The topic cannot give more weight than it has.</p>
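The optimization can be sketched with a column-wise non-negative least squares solve followed by clipping to [0, 1]; the solver choice and the toy matrices are assumptions, as the text does not specify how the problem is solved:

```python
# Estimate M minimizing ||T_w @ M - T_w1||_2 column by column with NNLS,
# then clip the entries to [0, 1]. Toy matrices for illustration.
import numpy as np
from scipy.optimize import nnls

T_w = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.0, 1.0]])           # 3 concepts, 2 topics in week w
T_w1 = np.array([[0.9, 0.1],
                 [0.1, 0.8],
                 [0.0, 0.9]])          # 2 topics in week w + 1

M = np.column_stack([nnls(T_w, T_w1[:, j])[0] for j in range(T_w1.shape[1])])
M = np.clip(M, 0.0, 1.0)               # transition matrix, entries in [0, 1]
```

The near-diagonal result shows two topics that each persist almost unchanged, matching the interpretation points above.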
<p>With a set of consecutive transition matrices Mq(w), Mq(w + 1), ..., Mq(w + g), it is possible to detect topics which remain stable for several weeks: they form a sequence of indices t1, t2, ..., tg such that Mq(w + i)[ti, ti+1] ≥ 1 − ε for i = 1, ..., g − 1 and some small ε. We consider a topic to be a stable topic if the above condition holds for g ≥ 5 with ε = 0.2, i.e. if the topic keeps approximately the same concept composition for at least 4 weeks. Figure 1 shows an example of a set of transition matrices that exhibit a stable topic.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Results</title>
<p>We decomposed all corpora into topics, leading to a different number of topics per week (Fig 2). After computing the corresponding transition matrices, we are able to detect the appearance of new topics as well as several stable topics. Among them, we can distinguish persistent and flash topics.</p>
<p>[Fig. 2: Number of topics (total and new) per week, for representation 0 and representation 1.]</p>
<p>Persistent topics are topics that the news source treats regularly. While interruptions in the data (weeks with few articles) sometimes fragmented these topics, in the sense of our definition of stable topics, they were quick to recover. Several stable topics, represented through their most important concepts, are presented in Table 1. Furthermore, we were able to detect flash topics, that is, topics which relate to specific, transient events. We found the following flash topics in the level 0 representation:
1. the 2010 Football World Cup in South Africa,
2. the 2010 artillery fire exchange in the Korean peninsula,
3. the 2011 drought,
4. the Arab Spring,
5. the Fukushima Daiichi nuclear disaster.
The most frequently mentioned concepts in these topics can be seen in Table 2. The evolution of topics can best be exemplified by the Football World Cup example. In Table 3, the change in the topic across weeks is shown. Notice how the number of countries decreases, and those which remain are also those which remained in the tournament. Let us recall that the tournament ended on the 11th of July.</p>
<p>Many of these topics were also found in the level 1 representation. In general, the level 1 representation yields longer stable topics, as can be seen in the length distributions of stable topics in Figure 3. It is worth mentioning that a topic which is very fragmented (in time) in the level 0 representation, namely that of the Japanese market, is less so in the level 1 representation. This suggests that looking at more abstract concepts can increase the stability of topic detection. Importantly, while using the broader concept of a set of leaves increases stability, there is no evidence that similar topics are confounded by this process. This is an indication that topics do not necessarily match branches of the taxonomy but, rather, are combinations of concepts from across the thesaurus. For the same reason, topics in consecutive weeks which are not deemed persistent under the level 0 representation become so in the level 1 representation. It is thus important to carefully choose the level of granularity of the concepts used to annotate a corpus with the aim of persistent topic detection. It must also be noted that detecting topics based solely on words (i.e. without a controlled vocabulary) does not provide this possibility.</p>
<p>Finally, preliminary results show that, on average, concepts gained by a topic during its evolution are closer in the thesaurus to the concepts already in the topic than is expected at random.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Conclusions and Future Work</title>
        <p>We have presented a method that is able to detect both persistent and
transient topics in news sources. Interestingly, we have shown that it is possible to
detect such topics both when annotating documents only with leaves from a
thesaurus, and when annotating them also with those concepts directly above
leaves. Furthermore, we have shown that the relatively simple NMF method
is able to detect stable topics, and that this stability can also be captured by
our proposed method for computing topic transitions. By considering the news
articles in each week as independent corpora, we are able to detect short-lived
topics that would otherwise be lost in a global topic model.</p>
        <p>The fact that topics are detectable in both representations is an indication that
topics are not easily confounded with each other when one considers more
abstract categories. This is an interesting result, for it shows that the concepts
comprising a topic are distributed widely enough across the thesaurus that
abstracting them just one level still allows for their detection. In a sense, the way that the concepts are organized in the thesaurus does not match the real-world topics. We believe that this result can serve as a measure of the generality of a thesaurus, and of its applicability to analyzing texts from various sources.</p>
        <p>[Fig. 3: Distributions of stable topic durations for the level 0 and level 1 representations.]</p>
<p>This work represents initial results in the analysis of topic transitions with the help of a thesaurus. In future work we intend to build on top of the current results and extend the methodology. In particular, the current results motivate us to:
1. Investigate and compare the representations of topics at different levels. Preliminary observations suggest that several stable topics at level 0, in non-overlapping weeks, could be merged into a single topic at level 1.
2. Exploit the greater stability of topics at level 1. This could be useful in the case of limited data, when a detailed topic could fade away.
3. Explore how the different levels of representation could prove useful for people with different backgrounds. Namely, we expect that more detailed topics could be of interest for experts in the field, whereas the more general topics could give a good overview for less experienced readers.
4. Improve the transition computation by taking the distances between concepts into account and employing methods similar to soft cosine similarity.
5. Perform the statistical analysis required to confirm our observations that the concepts gained during topic evolution are more likely to be close, in the thesaurus, to concepts already in a topic.</p>
<p>Acknowledgements This work is partially supported by the PROFIT project (http://projectprofit.eu/), part of the European Commission's H2020 Framework Programme, grant agreement no. 687895.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Dynamic topic models</article-title>
          .
          <source>In: Proceedings of the 23rd international conference on Machine learning</source>
          . pp.
          <volume>113</volume>
–
          <fpage>120</fpage>
          .
ACM
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jordan</surname>
            ,
            <given-names>M.I.</given-names>
          </string-name>
          :
          <article-title>Latent dirichlet allocation</article-title>
          .
          <source>Journal of machine Learning research 3(Jan)</source>
          ,
          <volume>993</volume>
–
          <fpage>1022</fpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Borst</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neubert</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Case study: Publishing stw thesaurus for economics as linked open data</article-title>
          .
          <source>W3C Semantic Web Use Cases and Case Studies</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duan</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Using structured events to predict stock price movement: An empirical investigation</article-title>
          .
          <source>In: EMNLP</source>
          . pp.
          <volume>1415</volume>
–
          <issue>1425</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Djurdjevac</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarich</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , Schutte, C.:
          <article-title>Estimating the eigenvalue error of markov state models</article-title>
          .
          <source>Multiscale Modeling &amp; Simulation</source>
          <volume>10</volume>
          (
          <issue>1</issue>
          ),
          <volume>61</volume>
–
          <fpage>81</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ferrugento</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>H.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodrigues</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Towards the improvement of a topic model with semantic knowledge</article-title>
          .
<source>In: Portuguese Conference on Artificial Intelligence</source>
          . pp.
          <volume>759</volume>
–
          <fpage>770</fpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Studying the history of ideas using topic models</article-title>
          .
          <source>In: Proceedings of the conference on empirical methods in natural language processing</source>
          . pp.
          <volume>363</volume>
–
          <fpage>371</fpage>
          .
Association for Computational Linguistics (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hofmann</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Unsupervised learning by probabilistic latent semantic analysis</article-title>
          .
          <source>Machine learning 42(1)</source>
          ,
          <volume>177</volume>
–
          <fpage>196</fpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Landauer</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laham</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foltz</surname>
            ,
            <given-names>P.W.</given-names>
          </string-name>
          :
          <article-title>Learning human-like knowledge by singular value decomposition: A progress report</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . pp.
          <volume>45</volume>
–
          <issue>51</issue>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seung</surname>
            ,
            <given-names>H.S.</given-names>
          </string-name>
          :
          <article-title>Learning the parts of objects by non-negative matrix factorization</article-title>
          .
          <source>Nature</source>
          <volume>401</volume>
          (
          <issue>6755</issue>
          ),
          <volume>788</volume>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Billingsley</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Johnson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Improving topic models with latent feature word representations</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>3</volume>
          ,
          <issue>299</issue>
–
          <fpage>313</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Recht</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ré</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tropp</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bittorf</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Factoring nonnegative matrices with linear programs</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          . pp.
          <fpage>1214</fpage>
          -
          <lpage>1222</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Saha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sindhwani</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Learning evolving and emerging topics in social media: A dynamic NMF approach with temporal regularization</article-title>
          .
          <source>In: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining</source>
          . pp.
          <fpage>693</fpage>
          -
          <lpage>702</lpage>
          . WSDM '12,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Vaca</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mantrach</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaimes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saerens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A time-based collective factorization for topic discovery and monitoring in news</article-title>
          .
          <source>In: Proceedings of the 23rd International Conference on World Wide Web</source>
          . pp.
          <fpage>527</fpage>
          -
          <lpage>538</lpage>
          . WWW '14,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Topical n-grams: Phrase and topic discovery, with an application to information retrieval</article-title>
          .
          <source>In: Seventh IEEE International Conference on Data Mining (ICDM 2007)</source>
          . pp.
          <fpage>697</fpage>
          -
          <lpage>702</lpage>
          .
          <publisher-name>IEEE</publisher-name>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>