                                       Proceedings of the 4th Congress on Robotics and Neuroscience



                                     Trending Topics on Science, a tensor
                                     memory hypothesis approach
                                     Felipe Torres¹*
                                     ¹ Universidad Técnica Federico Santa María
                                     *For correspondence: felipe.torrese@sansano.usm.cl (FT)




                                     Abstract Current human knowledge is written down. Documenting is the most common way to
                                     preserve memories and to store imaginative stories. Thus, to distinguish reality from fiction,
                                     scientific writing cites previous works in addition to reporting results from experimental setups.
                                     Books and scientific papers are only a small part of the existing literature, but they are considered
                                     more trustworthy as information sources. Using the information about the authors and the
                                     keywords in titles and abstracts, it is useful to find more relations and to know where to focus the
                                     search on a topic. This is possible using relational databases or knowledge graphs, a semantic
                                     approach; with the tensor memory hypothesis, which adds a temporal dimension, it is also possible
                                     to process the information with an episodic memory approach. Although knowledge graphs are
                                     widely used for question answering and chatbots, they need a relational schema generated
                                     beforehand, automatically or by hand, and stored in an easy-to-query file format. I use JATS, a
                                     standard format that allows integrating scientific papers into semantic searches but has not spread
                                     to all scientific publishers, to extract the markup tags from PDF files, current-year journal articles
                                     on one particular topic, and then construct the memory tensors with their references to extract
                                     relations and predictions with statistical relational learning techniques.




                                     Introduction
Memory is defined as the ability to record information and to recall it afterwards. Writing is a human invention that facilitates this capacity, in particular for declarative memories, which are facts or events that can be expressed with language and can be of two types: semantic or episodic (Tresp et al., 2017).
    The memories and knowledge of humanity are stored in written documents, which gain reliability if they include references to previous works by other authors. Scientific articles are the model of well-structured presentation and storage of information, each with its own title, explicit authorship, and references to related information in other documents or within the same document. Moreover, what is almost always relevant when deciding whether to read them, the retrieval action, is their publishing year. Thus, their ordered structure makes it possible to use them as a representation of global human episodic knowledge and memories. Scientific publication, as a human activity, can also be modeled as a social network. From this kind of network the expression “trending topic” emerged, naming the most frequent term or word used in a specific temporal window; it is understood as the principal theme or main subject related to the information described in a piece of content.
    In a mathematical and computational framework, semantic memories can be represented as knowledge graphs, where entities are nodes and links are relations between them. A relation between entities can then be defined as a triple (𝑠, 𝑝, 𝑜), or as a simple sentence subject-predicate-object. An episodic memory adds a time marker, so a temporal prepositional phrase is added to the simple sentence, subject-predicate-object-temporal_preposition,
or a quad (𝑠, 𝑝, 𝑜, 𝑡). This approach is widely used in semantic web technologies under the Linked Data methodology (Bizer et al., 2011).
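As a minimal illustration of the difference between a semantic triple and an episodic quad (the entity names below are hypothetical, not taken from the dataset), a quad can be stored as a Python tuple and the relations grouped by their time marker:

```python
# Minimal sketch: semantic triples vs. episodic quads (illustrative names only).
from collections import defaultdict

triples = [("Author_A", "wrote", "Article_1")]            # semantic: (s, p, o)
quads = [("Author_A", "wrote", "Article_1", 2017),        # episodic: (s, p, o, t)
         ("Article_1", "cited", "Article_0", 2017)]

# Index the quads by year to recover an "episodic" view of the graph.
by_year = defaultdict(list)
for s, p, o, t in quads:
    by_year[t].append((s, p, o))

print(by_year[2017])  # all relations asserted in 2017
```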
     Thus, it is plausible to use complex network analysis tools to search for the most relevant relations between authors, paper titles, or keywords. Scientific publication databases can easily contain millions of authors, papers, and their respective citations. A reduced number of relevant documents is expected from a specific topic query, not the thousands of results that search engines like Google Scholar or publishers' own engines can generate for a given string of words. The field of science of science studies these relations, and earlier works were carried out using knowledge graphs, which are expressed as adjacency matrices. If the temporal dimension and various types of relationships are considered, then it is possible to form tensors of fourth order. A matrix 𝑋 of the network can be bipartite (𝑋 ∈ ℝ^{𝑛×𝑚}) if there are two types of nodes (authors-articles, authors-words, articles-words) or monopartite (𝑋 ∈ ℝ^{𝑛×𝑛}); unweighted (𝑥_{𝑖𝑗} ∈ {0, 1}) or weighted (𝑥_{𝑖𝑗} ∈ ℝ); directed or undirected (𝑋^𝑇 = 𝑋) (Zeng et al., 2017).
     (Tresp and Ma, 2017) introduced the Tensor Memory Hypothesis, where a knowledge graph is represented by a Tucker decomposition of the tensors. It is based on representational learning, i.e., a discrete entity 𝑒 is associated with a vector of real numbers 𝐚_𝑒 called latent variables. (Tresp and Ma, 2017) also argue that representational learning might be the basis for perception, planning, and decision making. From a physiological point of view, there is evidence that the hippocampus plays a central role in the temporal organization of memories and supports the disambiguation of overlapping episodes (Eichenbaum, 2014a). In the standard consolidation theory of memory (SCT), episodic memory is a neocortical representation that arises from hippocampal activity, while in the multiple trace theory (MTT) episodic memory is represented only in the hippocampus and is used to form semantic memories in the neocortex. Also, there is evidence of the existence of “place cells” and “time cells” in the hippocampus, and that these support associative networks that represent spatiotemporal relations between the entities of memories (Eichenbaum, 2014b).
     There are some previous works on trending or hot topics in science: (Griffiths and Steyvers, 2004) used Latent Dirichlet Allocation (LDA) to analyze the abstracts of the Proceedings of the National Academy of Sciences (PNAS) from 1991 to 2001. (Wei et al., 2013) performed a statistical analysis to find out whether scientists follow hot topics in their investigations; they used papers published in the American Physical Society (APS) Physical Review journals from 1976 to 2009. (Kang and Lin, 2018) used non-smooth non-negative matrix factorization (nsNMF) to extract the most prominent topics from a dataset of keywords of scientific articles related to "Machine Learning" from 2014 to 2016 in arXiv.org stat.ML; the similarity of that work with the Tensor Memory Hypothesis lies in the use of matrix decomposition to reduce the rank of the matrix. (Alshareef et al., 2018) used indexes based on cosine similarity to estimate a score that represents the anticipation of a prospective relationship between authors; they used two subsets of the IEEE digital library containing the keywords “database” and “multimedia”.

Table 1. PCA variance for the number of latent components.

  Latent Components    PCA variance (%)
           3                 2.93
           5                 4.3
          10                 7.32
          15                10.03
          20                12.5
          25                14.8
          50                24.99
         100                41.88
         200                63.9


Results
The number of latent components is not associated with a specific statistical measure of the data. However, as a point of reference, Table 1 presents the corresponding percentage of variance if the same number of PCA components were employed.
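As a minimal sketch of how such a reference value can be computed (assuming the time-collapsed relation matrix is available as a NumPy array; the random matrix below is only a placeholder):

```python
# Sketch: explained variance for a given number of PCA components,
# assuming X is the (entities x entities) relation matrix collapsed over time.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(500, 500)  # placeholder for the real relation matrix

for r in (3, 5, 10, 15, 20, 25, 50, 100, 200):
    pca = PCA(n_components=r).fit(X)
    print(r, round(100 * pca.explained_variance_ratio_.sum(), 2), "%")
```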


Table 2. Most probable words for the query, by entity type.

                                          Entity Type
 Latent Components    Authors                 Articles                Words
          3           neuromodulation         neuromodulation         neuromodulation
          5           stimulus, presented     stimulus, presented     stimulus, technique
         10           presented               presented               presented
         15           sleep, memory           sleep                   sleep
         20           stimulus, memory        stimulus, cued          stimulus, cued
         25           memory, sws             memory, spatial, sws    memory, sws
         50           sleep, stimulus         sleep, stimulus         sleep, stimulus
        100           assr, memory            assr, memory            assr, memory
        200           wireless, monitoring    sleep, slow             sleep, slow


Table 3. Most probable words with NMF decomposition.

                                              Entity Type
 Latent Components    Authors                     Articles                  Words
          3           slow, sleep, auditory       stimulation, sleep        sleep, memory
          5           spindles, auditory, sleep   sleep                     sleep
         10           sleep, stimulation          sleep, stimulation        sleep, memory
         15           sleep, memory               brain, consolidation      sleep, memory
         20           sleep, memory               oscillations, sleep       sleep, memory
         25           sleep, stimulation          activity, memory          sleep, memory
         50           sleep, memory               oscillations, humans      sleep, memory
        100           sleep, role                 reactivation, slow-wave   sleep, memory
        200           sleep, slow                 sleep, brain              sleep, memory



    The words with the most relations in the complete tensor, before decomposition, are sleep, memory, stimulation, slow, brain, consolidation, auditory, spindles, reactivation, and activity. Table 2 is populated by selecting the most frequent word from queries of the type

$$\mathrm{word}_i = \arg\max_o \{P(s, o, t)\}, \qquad (1)$$

where 𝑠 is each author, paper title, or word in the database, 𝑜 is a word, 𝑡 is a year, and 𝑖 is the index of an entity.
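A minimal sketch of such a query, assuming the RESCAL factors A (entities × r) and the yearly core slice R_t have already been computed, and that word_idx holds the column indices of the word entities (all names here are illustrative):

```python
import numpy as np

def most_probable_word(A, R_t, s_idx, word_idx):
    """Return the word entity o maximizing the reconstructed score
    theta(s, o, t) = a_s^T R_t a_o, passed through a sigmoid (Eq. 1)."""
    scores = A[s_idx] @ R_t @ A[word_idx].T   # one score per candidate word
    probs = 1.0 / (1.0 + np.exp(-scores))     # P(s, o, t) = sig(theta)
    return word_idx[int(np.argmax(probs))]
```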
    The most probable words obtained from the same queries are more numerous when more latent components are used than with only a few latent variables. For example, there are 21 different words among the query results using 200 latent components. On the other hand, for few latent components the query results are only the words shown in Table 2.
    Table 3 is populated using the NMF decomposition of the time-collapsed matrix, obtained by adding the weights of each year. The most frequent words are selected as those with maximum weight for each topic, i.e., each k-th row of the matrix H of the decomposition. The same processing using the nsNMF decomposition yields the words sleep and memory as the most probable in all cases.
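As an illustrative sketch of this selection (assuming X_collapsed is the time-collapsed, non-negative matrix restricted to word columns, and vocab maps those columns to strings), scikit-learn's NMF can be used and the top-weighted columns of each row of H taken:

```python
import numpy as np
from sklearn.decomposition import NMF

def top_words_per_topic(X_collapsed, vocab, r, n_top=2):
    """Factorize the non-negative, time-collapsed matrix and return, for each
    of the r topics (rows of H), the n_top columns with the largest weights."""
    model = NMF(n_components=r, init="nndsvd", max_iter=500)
    W = model.fit_transform(X_collapsed)   # entities x r
    H = model.components_                  # r x word columns
    return [[vocab[j] for j in np.argsort(H[k])[::-1][:n_top]] for k in range(r)]
```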
    The analysis of relationships between entities needs a distance metric. Each entity is represented by a latent vector; one possible choice is the Euclidean distance, but given this particular type of data, content from documents, the usual metric is the cosine similarity. However, computing distances in the original data space demands a high computational cost; using a reduced space alleviates the cost of calculating distances but requires a previous, costly space transformation. Figure 1 is an example of the Euclidean
distance and cosine similarity extracted from the 𝐑 tensor of the RESCAL factorization. The difference between the years of the source papers and the years of papers that are only cited is most evident with fewer latent components. Moreover, the similarity is greater, and hence the Euclidean distance smaller, between the entities of the earlier years.
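A sketch of how such comparisons can be computed, assuming year_vectors holds one latent vector per year (for example, rows taken from a slice of the factorization; the variable names and sizes are illustrative):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances

year_vectors = np.random.rand(11, 25)   # placeholder: 11 years, 25 latent components

cos_sim = cosine_similarity(year_vectors)    # 11 x 11 matrix, compare with Figure 1A
eucl = euclidean_distances(year_vectors)     # 11 x 11 matrix, compare with Figure 1B
```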


Discussion
There are databases of scientific paper metadata, and it is also possible to extract an article's metadata from a specific journal or publisher. In practice, however, it is usual to have a few references coming from a previous search that belong to different journals or publishers; therefore, to extract the metadata I used the JATS format¹, a semantic web standard format for scientific papers popularized by the National Center for Biotechnology Information (NCBI). A more popular format is the Resource Description Framework (RDF), and various scientific publishers are adopting it.
    The analysis of the statistical features of the tensor without any further processing can give information about the most related entities, such as the most cited author, the most cited article, or the most used word in each slice of time. However, employing a tensor decomposition technique allows the use of a latent component space, where more information can be extracted because the relationships are expressed in fewer variables, thus clustering some properties of the data. This work is an example of how, from a small sample of documents with a known relationship between them (the topic was already known), some words that are not the most frequent can be extracted and provide a new perspective on the topics covered in the documents. Figure 1 is an example of extracted information that is not easy to visualize in the original space of the data.


    The comparison of different tensor decompositions and the search for the optimum number of latent components is work still to be done in order to take advantage of relational data, which, thanks to semantic web technologies, is not restricted to formal scientific documents and is available for various types of data. Also, the proposal of (Tresp et al., 2017) of considering knowledge graphs as semantic and episodic memories provides a framework that links computational memory with the biological one; its capacities and shortcomings need to be explored. Curiously, the etymology of “topic” comes from the Greek topos, or place, which, like memory, is another of the known cognitive functions of the hippocampus.

Figure 1. Distance metrics on the latent space. A. Cosine similarity between years with 3, 25, and 200 latent components. B. Euclidean distance between years with 3, 25, and 200 latent components.

    Finally, from the results obtained it is evident that sleep and memory are the most relevant words of the selected papers; these words, together with slow, are the few words that also result from the queries with the RESCAL decomposition. The nsNMF decomposition gives the same words for any number of components, so it is more robust to changes in the number of components.


Methods and Materials
Data extraction
The metadata of 11 articles from different publishers (Table 4), related to “Stimulation during NREM sleep” and available as PDF files, was obtained using the software CERMINE (Tkaczyk et al., 2015) and stored in JATS format. Afterwards, with a Python script, each article's own title, authors, and abstract were extracted, as well as the titles and authors of the references within the time range 2008-2018.

          ¹ https://jats.nlm.nih.gov/


Table 4. Articles used as the data source, related to “Stimulation during NREM sleep”.


 Year    Authors                  Title
  2016   Batterink et al.         Phase of Spontaneous Slow Oscillations during Sleep Influences Memory-Related Processing of Auditory Cues
  2016   Weigenand et al.         Timing matters: open-loop stimulation does not improve overnight consolidation of word pairs in humans
  2017   Besedovsky et al.        Auditory closed-loop stimulation of EEG slow oscillations strengthens sleep and signs of its immune-supportive function
  2017   Lafon et al.             Low frequency transcranial electrical stimulation does not entrain sleep rhythms measured by human intracranial recordings
  2017   Leminen et al.           Enhanced Memory Consolidation Via Automatic Sound Stimulation During Non-REM Sleep
  2017   Lustenberger et al.      High-density EEG characterization of brain responses to auditory rhythmic stimuli during wakefulness and NREM sleep
  2017   Oyarzun et al.           Targeted Memory Reactivation during Sleep Adaptively Promotes the Strengthening or Weakening of Overlapping Memories
  2017   Kinzing et al.           Odor cueing during slow-wave sleep benefits memory independently of low cholinergic tone
  2018   Ashton et al.            No effect of targeted memory reactivation during slow-wave sleep on emotional recognition memory
  2018   Debellemaniere et al.    Performance of an Ambulatory Dry-EEG Device for Auditory Closed-Loop Stimulation of Sleep Slow Oscillations in the Home Environment
  2018   Ezzyat et al.            Closed-loop stimulation of temporal cortex rescues functional networks and improves memory



    Later, the titles and abstracts were tokenized and tagged by part of speech, using the nltk library, to extract the adjectives and nouns, which are considered the principal terms of the articles. To de-duplicate authors, all names were formatted as “(Last name) (First name initial.) (Middle name initial.)”. To de-duplicate titles and words, all words were transformed to lowercase and special characters were eliminated.
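A minimal sketch of this extraction and tagging step, assuming one JATS XML file per article as produced by CERMINE (the tag paths follow JATS conventions, but the details may differ from the actual script):

```python
import xml.etree.ElementTree as ET
import nltk  # requires the 'punkt' and 'averaged_perceptron_tagger' data packages

def extract_metadata(jats_file):
    """Pull the title, authors, and abstract from a JATS XML file produced by CERMINE."""
    root = ET.parse(jats_file).getroot()
    title = root.findtext(".//article-meta//article-title", default="")
    abs_el = root.find(".//article-meta/abstract")
    abstract = " ".join(abs_el.itertext()) if abs_el is not None else ""
    authors = []
    for name in root.findall(".//article-meta//contrib[@contrib-type='author']/name"):
        surname = name.findtext("surname", default="")
        given = name.findtext("given-names", default="")
        initials = " ".join(g[0] + "." for g in given.split())
        authors.append(f"{surname} {initials}".strip())  # "(Last name) (Initials.)"
    return title, authors, abstract

def principal_terms(text):
    """Keep only the nouns and adjectives of the tokenized, lowercased text."""
    tokens = nltk.word_tokenize(text.lower())
    return [w for w, tag in nltk.pos_tag(tokens) if tag.startswith(("NN", "JJ"))]
```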
    For each year, a square matrix of zeros 𝑋_𝑘 ∈ ℝ^{(𝑛_𝑎+𝑛_𝑡+𝑛_𝑤)×(𝑛_𝑎+𝑛_𝑡+𝑛_𝑤)} was populated with weighted and directed values, using the following rules applied only to the relations of the corresponding 𝑘-th year:

                                 Author_i co-wrote with Author_j                x^k_{i,j} += 2
                                 Author_i cited Author_j                        x^k_{i,j} += 1
                                 Author_i wrote Article_j                       x^k_{i,j} += 2
                                 Author_i cited Article_j                       x^k_{i,j} += 1
                                 Author_i wrote Word_j                          x^k_{i,j} += 1
                                 Article_i cited Article_j                      x^k_{i,j} += 1
                                 Article_i contained Word_j                     x^k_{i,j} += 1
                                 Word_i is in the same document as Word_j       x^k_{i,j} += 1

   This approach for expressing the relations simplifies the tensor representation, because the dimension corresponding to the predicate is intrinsic in the weighted values, and it allows the use of the RESCAL factorization. (Ma et al., 2018) explain other tensor decomposition methods that could be used to obtain the latent components.
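A minimal sketch of the yearly matrix construction under these rules (the relation list, index convention, and weight table are illustrative, not the actual script):

```python
import numpy as np

def build_year_matrix(n_entities, relations, weights):
    """relations: iterable of (i, j, relation_type) for a single year k.
    weights: e.g. {"co-wrote": 2, "wrote-article": 2, "cited": 1, "contained": 1}."""
    X_k = np.zeros((n_entities, n_entities))
    for i, j, rel in relations:
        X_k[i, j] += weights[rel]   # weighted, directed increment x^k_{i,j} += w
    return X_k
```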

Cosine similarity
The cosine similarity is an adequate distance metric for vectors whose magnitude depends on the size of the sample, such as the frequency of words in a document:

$$s = \frac{x \cdot y}{\|x\|_2 \|y\|_2}. \qquad (2)$$

Tensor memory hypothesis
A fourth-order tensor can be decomposed as

$$X \approx G \times_1 A_s \times_2 A_p \times_3 A_o \times_4 A_t. \qquad (3)$$

The probability of the existence of the relationship between the entities of a quad is given by

$$P((s, p, o, t)) = \mathrm{sig}(\theta_{s,p,o,t}), \qquad (4)$$

where

$$\mathrm{sig}(x) = \frac{1}{1 + e^{-x}}, \qquad (5)$$




$$\theta_{s,p,o,t} = f^e(\mathbf{a}^e_s, \mathbf{a}^e_p, \mathbf{a}^e_o, \mathbf{a}^e_t), \qquad (6)$$

$$f^e(\mathbf{a}^e_s, \mathbf{a}^e_p, \mathbf{a}^e_o, \mathbf{a}^e_t) = \sum_{r_1=1}^{\bar{r}} \sum_{r_2=1}^{\bar{r}} \sum_{r_3=1}^{\bar{r}} \sum_{r_4=1}^{\bar{r}} a^e_{s,r_1}\, a^e_{p,r_2}\, a^e_{o,r_3}\, a^e_{t,r_4}\; g^e(r_1, r_2, r_3, r_4). \qquad (7)$$

    The analysis of tensors, as for matrices, can be performed using a reduced form obtained by factorization. One popular tensor factorization method is the Tucker representation; however, there are other matrix and tensor decomposition algorithms. Here, I used RESCAL, and the construction of the tensor with weighted values allows omitting the predicate dimension, so the characteristic function becomes

$$f^e(\mathbf{a}^e_s, \mathbf{a}^e_o, \mathbf{a}^e_t) = \sum_{r_1=1}^{\bar{r}} \sum_{r_2=1}^{\bar{r}} \sum_{r_3=1}^{\bar{r}} a^e_{s,r_1}\, a^e_{o,r_2}\, a^e_{t,r_3}\; g^e(r_1, r_2, r_3). \qquad (8)$$
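A minimal sketch of this score function, assuming the latent vectors and the core tensor have already been estimated (all names are illustrative):

```python
import numpy as np

def score(a_s, a_o, a_t, G):
    """Trilinear characteristic function f^e of Eq. (8): a_s, a_o, a_t are latent
    vectors of length r and G is the r x r x r core tensor."""
    theta = np.einsum("i,j,k,ijk->", a_s, a_o, a_t, G)
    return 1.0 / (1.0 + np.exp(-theta))   # P((s, o, t)) = sig(theta)
```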


RESCAL
This tensor decomposition was proposed by (Nickel, 2013). The decomposed tensor needs to have
two dimensions of the same size, i.e., 𝐗 ∈ ℝ𝑛×𝑛×𝑚 and the results are a matrix 𝐴 ∈ ℝ𝑛×𝑟 and a tensor
𝐑 ∈ ℝ𝑟×𝑟×𝑚 .
$$X \approx R \times_1 A \times_2 A, \qquad X_k = A R_k A^T. \qquad (9)$$
The algorithm is an alternating least squares (ALS) procedure where the outputs are updated with:
$$A \leftarrow \left( \sum_{k=1}^{m} X_k A R_k^T + X_k^T A R_k \right) \left( \sum_{k=1}^{m} R_k A^T A R_k^T + R_k^T A^T A R_k + \lambda_A I \right)^{-1}, \qquad (10)$$

$$R_k \leftarrow V \left( P * \left( U^T X_k U \right) \right) V^T, \qquad (11)$$

where 𝑅_𝑘 is a slice of the tensor 𝐑 and, for the optimization, a singular value decomposition of the matrix 𝐴 is employed; 𝑃 is the matrix such that diag(vec(𝑃)) = 𝑆̂, which can be constructed by rearranging the diagonal entries of 𝑆̂ via the inverse vectorization operator vec_r^{-1}(⋅):

$$A = U \Sigma V^T. \qquad (12)$$

Then, for the regularization, the Kronecker product of the diagonal matrix is employed:

$$S = \Sigma \otimes \Sigma, \qquad (13)$$

$$\hat{S}_{ii} = \frac{S_{ii}}{S_{ii}^2 + \lambda_R}. \qquad (14)$$
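A minimal sketch of one ALS pass following Eq. (10), assuming X is a list of the yearly n × n slices; for brevity the R_k update is written as the equivalent regularized least-squares problem instead of the SVD shortcut of Eqs. (11)-(14), which is fine for small examples:

```python
import numpy as np

def rescal_als_step(X, A, R, lam_A=0.1, lam_R=0.1):
    """One alternating least squares pass. X: list of (n, n) slices,
    A: (n, r) factor matrix, R: list of (r, r) core slices."""
    n, r = A.shape
    # Update A as in Eq. (10).
    num = sum(Xk @ A @ Rk.T + Xk.T @ A @ Rk for Xk, Rk in zip(X, R))
    den = sum(Rk @ A.T @ A @ Rk.T + Rk.T @ A.T @ A @ Rk for Rk in R) + lam_A * np.eye(r)
    A = num @ np.linalg.inv(den)
    # Update each R_k by solving the regularized least-squares problem
    # vec(R_k) = (Z^T Z + lam_R I)^{-1} Z^T vec(X_k), with Z = A kron A.
    Z = np.kron(A, A)                       # (n*n, r*r); only practical for small n
    lhs = Z.T @ Z + lam_R * np.eye(r * r)
    R = [np.linalg.solve(lhs, Z.T @ Xk.reshape(-1)).reshape(r, r) for Xk in X]
    return A, R
```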

Non-negative Matrix Factorization (NMF)
This matrix factorization method finds two matrices 𝑊 ∈ ℝ^{𝑛×𝑟} and 𝐻 ∈ ℝ^{𝑟×𝑚} whose product minimizes the Frobenius norm of the difference with the original matrix 𝑋 ∈ ℝ^{𝑛×𝑚}:

$$X \approx W H. \qquad (15)$$

The element-wise multiplicative updates, using the algorithm proposed by (Lee and Seung, 2001), are:

$$W \leftarrow W \frac{X H^T}{W H H^T + \delta}, \qquad (16)$$

$$H \leftarrow H \frac{W^T X}{W^T W H + \delta}. \qquad (17)$$
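A minimal NumPy sketch of these updates (δ is the small constant that prevents division by zero; the fixed iteration count is arbitrary):

```python
import numpy as np

def nmf(X, r, n_iter=200, delta=1e-9):
    """Multiplicative-update NMF, Eqs. (16)-(17): X (n x m) >= 0, returns W, H."""
    n, m = X.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((n, r)), rng.random((r, m))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + delta)   # Eq. (16)
        H *= (W.T @ X) / (W.T @ W @ H + delta)   # Eq. (17)
    return W, H
```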



Non-smooth Non-negative Matrix Factorization (nsNMF)
This decomposition is a modification of NMF used by (Kang and Lin, 2018):

$$X \approx W S H, \qquad (18)$$

where

$$S = (1 - \theta) I + \frac{\theta}{k} \mathbf{1}\mathbf{1}^T, \qquad (19)$$

$$D = \left( \sum_{j=1}^{m} H_{i,j} \right) I, \qquad (20)$$

and, using

$$W = W_h D^{-1} S^{-1}, \qquad (21)$$

the matrix decomposition can finally be expressed as

$$X \approx W D^{-1} S^{-1} S D H. \qquad (22)$$
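A minimal sketch of the smoothing matrix of Eq. (19), which is the ingredient that distinguishes nsNMF from plain NMF (θ controls the sparseness; W and H could come from the NMF sketch above):

```python
import numpy as np

def smoothing_matrix(k, theta):
    """S = (1 - theta) I + (theta / k) 11^T, from Eq. (19)."""
    return (1 - theta) * np.eye(k) + (theta / k) * np.ones((k, k))

# Example (illustrative): smooth the topic matrix H before selecting top words.
# W, H = nmf(X, r=10); H_smooth = smoothing_matrix(10, theta=0.5) @ H
```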


Funding
This work was supported by Beca Doctorado Nacional Conicyt, Folio No 21180640.


References
Alshareef AM, Alhamid MF, El Saddik A. Recommending Scientific Collaboration Based on Topical, Authors and
  Venues Similarities. 2018 IEEE International Conference on Information Reuse and Integration (IRI). 2018; p.
  55–61. https://ieeexplore.ieee.org/document/8424687/, doi: 10.1109/IRI.2018.00016.

Bizer C, Heath T, Berners-Lee T. Linked data: The story so far. In: Semantic services, interoperability and web
  applications: emerging concepts IGI Global; 2011.p. 205–227.

Eichenbaum H. Memory on time. Trends in Cognitive Sciences. 2014; 17(2):81–88. doi: 10.1016/j.tics.2012.12.007.

Eichenbaum H. Time cells in the hippocampus: A new dimension for mapping memories. Nature Reviews
   Neuroscience. 2014; 15(11):732–744. doi: 10.1038/nrn3827.

Griffiths TL, Steyvers M. Finding scientific topics. Proceedings of the National academy of Sciences. 2004;
  101(suppl 1):5228–5235.

Kang Y, Lin KP. Topic Diffusion Discovery based on Sparseness-constrained Non-negative Matrix Factorization. 2018; doi: 10.1109/IRI.2018.00021.

Lee D, Seung H. Algorithms for non-negative matrix factorization. Advances in neural information processing sys-
  tems. 2001; (1):556–562. http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization,
  doi: 10.1109/IJCNN.2008.4634046.

Ma Y, Tresp V, Daxberger E. Embedding Models for Episodic Memory. 2018 Jun; http://arxiv.org/abs/1807.00228.

Nickel M. Tensor Factorization for Relational Learning. 2013; p. 161. http://nbn-resolving.de/urn:nbn:de:bvb:19-160568.

Tkaczyk D, Szostek P, Fedoryszak M, Dendek PJ, Bolikowski Ł. CERMINE: automatic extraction of structured
  metadata from scientific literature. International Journal on Document Analysis and Recognition (IJDAR). 2015;
  18(4):317–335.

Tresp V, Ma Y. The Tensor Memory Hypothesis. 2017; http://arxiv.org/abs/1708.02918.

Tresp V, Ma Y, Baier S, Yang Y. Embedding learning for declarative memories. Lecture Notes in Computer
  Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2017;
  10249 LNCS:202–216. doi: 10.1007/978-3-319-58068-5_13.

Wei T, Li M, Wu C, Yan XY, Fan Y, Di Z, Wu J. Do scientists trace hot topics? Scientific Reports. 2013; 3:3–7. doi:
 10.1038/srep02207.

Zeng A, Shen Z, Zhou J, Wu J, Fan Y, Wang Y, Stanley HE. The science of science: From the perspective of
  complex systems. Physics Reports. 2017; 714-715:1–73. https://doi.org/10.1016/j.physrep.2017.10.001, doi:
  10.1016/j.physrep.2017.10.001.