=Paper=
{{Paper
|id=Vol-2312/CRoNe2018_article_2
|storemode=property
|title=Trending Topics on Science, a Tensor Memory Hypothesis Approach
|pdfUrl=https://ceur-ws.org/Vol-2312/CRoNe2018_article_2.pdf
|volume=Vol-2312
|authors=Felipe Torres
|dblpUrl=https://dblp.org/rec/conf/crone/Torres18
}}
==Trending Topics on Science, a Tensor Memory Hypothesis Approach==
Proceedings of the 4th Congress on Robotics and Neuroscience

Felipe Torres¹*

¹ Universidad Técnica Federico Santa María

*For correspondence: felipe.torrese@sansano.usm.cl (FT)

Abstract

Current human knowledge is written down. Documenting is the most common way to preserve memories and to store fantastic stories. Thus, to distinguish reality from fiction, scientific writing cites previous works in addition to reporting results from experimental setups. Books and scientific papers are only a small part of the existing literature, but they are considered more trustworthy as information sources. Information about the authors and about the keywords in titles and abstracts is useful to find further relations and to know where to focus the search for a topic. This is possible using relational databases or knowledge graphs, a semantic approach; with the tensor memory hypothesis, which adds a temporal dimension, it is possible to process the information with an episodic memory approach. Although knowledge graphs are widely used in question answering and chatbots, they need a relational schema, generated automatically or by hand, and stored in an easy-to-query file format. I use JATS, a standard format that allows integrating scientific papers into semantic searches but has not spread to all scientific publishers, to extract the markup tags from PDF files of current-year journal articles on one particular topic, and then construct the memory tensors with their references to extract relations and predictions with statistical relational learning techniques.

Introduction

Memory is defined as the ability to record information and recall it afterwards. Writing is a human invention that facilitates this capacity, in particular for declarative memories, i.e., facts or events that can be expressed with language, which can be of two types: semantic or episodic (Tresp et al., 2017). The memories and knowledge of humanity are stored in written documents, which gain reliability when they include references to previous works by other authors. Scientific articles are the model of well-structured presentation and storage of information, each one with its own title, explicit authorship, and references to related information in other documents or within the same document. But what is almost always relevant when deciding whether to read them, the retrieval action, is their publication year. Thus, their ordered structure makes it possible to use them as a representation of global human episodic knowledge and memories. Also, scientific publication as a human activity can be modeled as a social network. From this kind of network emerged the expression "trending topic", which names the most frequent term or word used within a specific temporal window and is understood as the principal theme or main subject related to the information described in a piece of content.

In a mathematical and computational framework, semantic memories can be represented as knowledge graphs, where entities are nodes and links are relations between them. A relation between entities can then be defined as a triple (s, p, o), i.e., a simple subject-predicate-object sentence. An episodic memory adds a time marker, so a temporal prepositional phrase is appended to the simple sentence, subject-predicate-object-temporal_preposition, or a quad (s, p, o, t).
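As a minimal illustration of this quad representation, the following sketch stores a few episodic facts as (subject, predicate, object, time) tuples and stacks them into a binary fourth-order tensor; the entity names, predicates, and years are toy assumptions chosen only for the example, not the paper's dataset.

```python
import numpy as np

# Toy dictionaries mapping entities, predicates and years to indices (illustrative only).
entities = {"Torres F.": 0, "article_trending_topics": 1, "sleep": 2}
predicates = {"wrote": 0, "contains": 1}
years = {2018: 0}

# Episodic facts as (subject, predicate, object, time) quads.
quads = [
    ("Torres F.", "wrote", "article_trending_topics", 2018),
    ("article_trending_topics", "contains", "sleep", 2018),
]

# A binary fourth-order tensor: X[s, p, o, t] = 1 if the quad is known to be true.
X = np.zeros((len(entities), len(predicates), len(entities), len(years)), dtype=int)
for s, p, o, t in quads:
    X[entities[s], predicates[p], entities[o], years[t]] = 1

print(X.sum(), X.shape)
```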
This approach is widely used in semantic web technologies under the Linked Data methodology (Bizer et al., 2011). Thus, it is plausible to use complex network analysis tools to search for the most relevant relations between authors, paper titles, or keywords. Scientific publication databases can easily contain millions of authors and papers with their respective citations. From a query on a specific topic, a reduced number of relevant documents is expected, not the thousands of results that search engines like Google Scholar or the publishers' own engines can generate for a given string of words. The field of science of science studies these relations, and earlier works were carried out using knowledge graphs, which are expressed as adjacency matrices. If the temporal dimension and various types of relationships are considered, then it is possible to form tensors of fourth order. A matrix X of the network can be bipartite (X ∈ ℝ^{n×m}) if there are two types of nodes (authors-articles, authors-words, articles-words) or monopartite (X ∈ ℝ^{n×n}); unweighted (x_ij ∈ {0, 1}) or weighted (x_ij ∈ ℝ); directed or undirected (X^T = X) (Zeng et al., 2017).

(Tresp and Ma, 2017) introduced the Tensor Memory Hypothesis, where a knowledge graph is represented by a Tucker decomposition of the tensors. It is based on representational learning, i.e., a discrete entity e is associated with a vector of real numbers a_e called latent variables. (Tresp and Ma, 2017) also argue that representational learning might be the basis for perception, planning, and decision making. From a physiological point of view, there is evidence that the hippocampus plays a central role in the temporal organization of memories and supports the disambiguation of overlapping episodes (Eichenbaum, 2014a). In the standard consolidation theory of memory (SCT), episodic memory is a neocortical representation that arises from hippocampal activity, while in the multiple trace theory (MTT) episodic memory is represented only in the hippocampus and is used to form semantic memories in the neocortex. There is also evidence for the existence of "place cells" and "time cells" in the hippocampus, and that these support associative networks representing spatiotemporal relations between the entities of memories (Eichenbaum, 2014b).

There are some previous works on trending or hot topics in science. (Griffiths and Steyvers, 2004) used Latent Dirichlet Allocation (LDA) to analyze the abstracts of the Proceedings of the National Academy of Sciences (PNAS) from 1991 to 2001. (Wei et al., 2013) performed a statistical analysis to find out whether scientists follow hot topics in their investigations, using papers published in the American Physical Society (APS) Physical Review journals from 1976 to 2009. (Kang and Lin, 2018) used non-smooth non-negative matrix factorization (nsNMF) to extract the most prominent topics from a dataset of keywords of scientific articles related to "Machine Learning" posted on arXiv.org stat.ML from 2014 to 2016; the similarity of that work with the Tensor Memory Hypothesis lies in the use of matrix decomposition to reduce the rank of the matrix. (Alshareef et al., 2018) used indexes based on cosine similarity to estimate a score that represents the anticipation of a prospective relationship between authors. They used two subsets of the IEEE digital library containing the keywords "database" and "multimedia".
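To make the latent-representation idea behind the Tensor Memory Hypothesis concrete, here is a minimal sketch that associates each entity, predicate, and year with a latent vector and scores a quad through a Tucker-style multilinear form followed by a sigmoid; the vocabularies, the rank, and the random values are illustrative assumptions, not the model trained in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative vocabularies; names, sizes and values are toy assumptions.
entities = ["author_a", "article_b", "word_sleep", "word_memory"]
predicates = ["wrote", "cited", "contains"]
years = [2016, 2017, 2018]
rank = 4  # number of latent components

# Representational learning: each discrete item is mapped to a latent vector a_e.
A_e = {e: rng.normal(size=rank) for e in entities}
A_p = {p: rng.normal(size=rank) for p in predicates}
A_t = {t: rng.normal(size=rank) for t in years}
G = rng.normal(size=(rank, rank, rank, rank))  # Tucker core tensor

def quad_score(s, p, o, t):
    """Multilinear Tucker form over the latent vectors of a quad (s, p, o, t)."""
    return np.einsum("i,j,k,l,ijkl->", A_e[s], A_p[p], A_e[o], A_t[t], G)

def quad_probability(s, p, o, t):
    """Plausibility of the quad as a sigmoid of its score."""
    return 1.0 / (1.0 + np.exp(-quad_score(s, p, o, t)))

print(quad_probability("article_b", "contains", "word_sleep", 2017))
```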
Results

The number of latent components is not associated with a specific statistical measure of the data. However, as a point of reference, Table 1 presents the corresponding percentage of variance that would be explained if the same number of PCA components were employed.

Table 1. PCA variance for the number of latent components.

Latent Components | PCA variance (%)
3 | 2.93
5 | 4.3
10 | 7.32
15 | 10.03
20 | 12.5
25 | 14.8
50 | 24.99
100 | 41.88
200 | 63.9

The words with the most relations in the complete tensor, before decomposition, are sleep, memory, stimulation, slow, brain, consolidation, auditory, spindles, reactivation, and activity. Table 2 is populated using a selection strategy of the most frequent word from queries of the type

word_i = \arg\max_o \{ P(s, o, t) \},    (1)

where s is each author, paper title, or word in the database, o is a word, t is a year, and i is the index of an entity. The queries return a larger variety of most probable words when more latent components are used than when only a few latent variables are used; for example, there are 21 different words among the query results with 200 latent components, while for a few latent components the query results are only the words shown in Table 2.

Table 2. Most probable words for the query, by entity type.

Latent Components | Authors | Articles | Words
3 | neuromodulation | neuromodulation | neuromodulation
5 | stimulus, presented | stimulus, presented | stimulus, technique
10 | presented | presented | presented
15 | sleep, memory | sleep | sleep
20 | stimulus, memory | stimulus, cued | stimulus, cued
25 | memory, sws | memory, spatial, sws | memory, sws
50 | sleep, stimulus | sleep, stimulus | sleep, stimulus
100 | assr, memory | assr, memory | assr, memory
200 | wireless, monitoring | sleep, slow | sleep, slow

Table 3 is populated using the NMF decomposition of the matrix collapsed over time, obtained by adding the weights of each year. The most frequent words are selected as those with maximum value for each topic, i.e., each k-th row of the matrix H of the decomposition. The same processing with the nsNMF decomposition yields the words sleep and memory as the most probable in all cases.

Table 3. Most probable words with NMF decomposition.

Latent Components | Authors | Articles | Words
3 | slow, sleep, auditory | stimulation, sleep | sleep, memory
5 | spindles, auditory, sleep | sleep | sleep
10 | sleep, stimulation | sleep, stimulation | sleep, memory
15 | sleep, memory | brain, consolidation | sleep, memory
20 | sleep, memory | oscillations, sleep | sleep, memory
25 | sleep, stimulation | activity, memory | sleep, memory
50 | sleep, memory | oscillations, humans | sleep, memory
100 | sleep, role | reactivation, slow-wave | sleep, memory
200 | sleep, slow | sleep, brain | sleep, memory

The analysis of relationships between entities needs a distance metric. Each entity is represented by latent vectors, so one possible choice is the Euclidean distance, but given this particular type of data, content from documents, the usual metric employed is the cosine similarity. However, computing distances in the original data space demands a high computational cost; using a reduced space alleviates the cost of calculating distances, but requires a previous, costly space transformation. Figure 1 is an example of the Euclidean distance and the cosine similarity extracted from the R tensor of the RESCAL factorization. The difference between the years of the source papers and the years of papers that are only cited is most evident with fewer latent components. Moreover, the similarity is greater, and hence the Euclidean distance smaller, between the entities of the earlier years.
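To make the comparison shown in Figure 1 concrete, the following minimal sketch computes the cosine similarity (Eq. (2) in Methods) and the Euclidean distance between latent year vectors; the matrix of latent vectors here is random toy data standing in for the decomposition output, and the sizes are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assume a latent representation of years, e.g. rows of a factor matrix
# obtained from a tensor decomposition (random toy values here).
n_years, rank = 11, 25          # years 2008-2018, 25 latent components
A_years = rng.normal(size=(n_years, rank))

def cosine_similarity(x, y):
    """s = (x . y) / (||x||_2 ||y||_2)."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def euclidean_distance(x, y):
    return float(np.linalg.norm(x - y))

# Pairwise comparison between all years, as plotted in Figure 1.
cos = np.array([[cosine_similarity(a, b) for b in A_years] for a in A_years])
euc = np.array([[euclidean_distance(a, b) for b in A_years] for a in A_years])
print(cos.shape, euc.shape)
```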
Discussion

There are databases of scientific paper metadata, and it is also possible to extract an article's metadata from a specific journal or publisher. In practice, however, it is usual to have a few references from a previous search that come from different journals or publishers, so to extract the metadata I used the JATS format (https://jats.nlm.nih.gov/), a semantic web standard format for scientific papers popularized by the National Center for Biotechnology Information (NCBI). A more popular format is the Resource Description Framework (RDF), and various scientific publishers are adopting it.

The analysis of the statistical features of the tensor, without any further processing, can give information about the most related entities, such as the most cited author, the most cited article, or the most used word in each time slice. However, employing a tensor decomposition technique allows the use of a latent component space, where more information can be extracted because the relationships are expressed in fewer variables, thus clustering some properties of the data. This work is an example of how, from a small sample of documents with a known relationship between them (the topic was already known), some words that are not the most frequent can be extracted and provide a new perspective on the topics covered in the documents. Figure 1 is an example of extracted information that is not easy to visualize in the original space of the data.

Figure 1. Distance metrics on the latent space. A. Cosine similarity between years with 3, 25, and 200 latent components. B. Euclidean distance between years with 3, 25, and 200 latent components.

The comparison of different tensor decompositions and the search for the optimum number of latent components is work still to be done in order to take advantage of relational data, which, due to semantic web technologies, is not restricted to formal scientific documents and is available for various types of data. Also, the proposal of (Tresp et al., 2017) of considering knowledge graphs as semantic and episodic memories provides a framework that links computational memory with the biological one; its capacities and defects need to be explored. Curiously, the etymology of "topic" comes from the Greek topos, or place, which, like memory, is another of the known cognitive functions of the hippocampus.

Finally, from the results obtained it is evident that sleep and memory are the most relevant words of the selected papers; these words and slow are the few words that also result from the queries with the RESCAL decomposition. The nsNMF decomposition gives the same words for any number of components, so it is more robust to changes in the number of components.
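As an illustration of the word queries discussed in the Results (the selection strategy of Eq. (1)), the following minimal sketch reconstructs a yearly slice from RESCAL-style factors and takes the argmax over the word entities; the shapes, the entity index split, and the random factors are assumptions for illustration, not the actual dataset or fitted model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy RESCAL-style factors: A holds latent vectors for all entities
# (authors, article titles, words), R holds one r x r slice per year.
n_entities, rank, n_years = 60, 10, 11
A = rng.normal(size=(n_entities, rank))
R = rng.normal(size=(n_years, rank, rank))

# Assume the last 20 entity indices correspond to words (illustrative split).
word_indices = np.arange(40, 60)

def most_probable_word(s, t):
    """Eq. (1): word_i = argmax_o P(s, o, t), restricted to word entities."""
    scores = A[s] @ R[t] @ A[word_indices].T   # reconstructed X_t[s, words]
    return word_indices[int(np.argmax(scores))]

# Query the most related word for entity 0 in the last year of the range.
print(most_probable_word(s=0, t=n_years - 1))
```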
Methods and Materials

Data extraction

The metadata of 11 articles from different publishers (Table 4), related to "Stimulation during NREM sleep" and available as PDF files, was obtained using the software CERMINE (Tkaczyk et al., 2015) and stored in JATS format. Then, with a Python script, each article's own title, authors, and abstract were extracted, together with the titles and authors of the references within the time range 2008-2018. Later, the titles and abstracts were tokenized and semantically tagged, using the nltk library, to extract the adjectives and nouns, which are considered the principal terms of the articles. For de-duplicating authors, all names were formatted as "(Last name) (First name initial.) (Middle name initial.)". For de-duplication of titles and words, all words were transformed to lowercase and special characters were eliminated.

Table 4. Articles used as data sources.

Year | Authors | Title
2016 | Batterink et al. | Phase of Spontaneous Slow Oscillations during Sleep Influences Memory-Related Processing of Auditory Cues
2016 | Weigenand et al. | Timing matters: open-loop stimulation does not improve overnight consolidation of word pairs in humans
2017 | Besedovsky et al. | Auditory closed-loop stimulation of EEG slow oscillations strengthens sleep and signs of its immune-supportive function
2017 | Lafon et al. | Low frequency transcranial electrical stimulation does not entrain sleep rhythms measured by human intracranial recordings
2017 | Leminen et al. | Enhanced Memory Consolidation Via Automatic Sound Stimulation During Non-REM Sleep
2017 | Lustenberger et al. | High-density EEG characterization of brain responses to auditory rhythmic stimuli during wakefulness and NREM sleep
2017 | Oyarzun et al. | Targeted Memory Reactivation during Sleep Adaptively Promotes the Strengthening or Weakening of Overlapping Memories
2017 | Kinzing et al. | Odor cueing during slow-wave sleep benefits memory independently of low cholinergic tone
2018 | Ashton et al. | No effect of targeted memory reactivation during slow-wave sleep on emotional recognition memory
2018 | Debellemaniere et al. | Performance of an Ambulatory Dry-EEG Device for Auditory Closed-Loop Stimulation of Sleep Slow Oscillations in the Home Environment
2018 | Ezzyat et al. | Closed-loop stimulation of temporal cortex rescues functional networks and improves memory

For each year k, a square matrix of zeros X_k ∈ ℝ^{(n_a+n_t+n_w)×(n_a+n_t+n_w)} was populated with weighted and directed values using the following rules, applied only to relations occurring in the corresponding year k:

Author_i co-wrote with Author_j: x^k_{i,j} += 2
Author_i cited Author_j: x^k_{i,j} += 1
Author_i wrote Article_j: x^k_{i,j} += 2
Author_i cited Article_j: x^k_{i,j} += 1
Author_i wrote Word_j: x^k_{i,j} += 1
Article_i cited Article_j: x^k_{i,j} += 1
Article_i contained Word_j: x^k_{i,j} += 1
Word_i is in the same document as Word_j: x^k_{i,j} += 1

This approach to expressing the relations simplifies the tensor representation, because the dimension corresponding to the predicate is intrinsic in the weighted values, and it allows the use of the RESCAL factorization. (Ma et al., 2018) explain other tensor decomposition methods that could be used to obtain the latent components.
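The following minimal sketch illustrates how relations of this kind could be accumulated into the yearly matrices; the example entities, relation names, and index mapping are hypothetical and chosen only for illustration, not the extracted dataset.

```python
import numpy as np

# Illustrative entity index: authors, article titles, and words share one index space.
entities = ["Batterink L.", "Lewis P.", "article_phase_slow_osc", "sleep", "memory"]
idx = {name: i for i, name in enumerate(entities)}
years = list(range(2008, 2019))
n = len(entities)

# One square slice per year: X has shape (n_years, n, n).
X = np.zeros((len(years), n, n))

# Weights per relation type, following the rules listed above.
WEIGHTS = {"co-wrote": 2, "wrote-article": 2, "cited": 1,
           "wrote-word": 1, "contained": 1, "co-occurs": 1}

def add_relation(year, relation, i, j):
    """Accumulate a weighted, directed relation into the slice of the given year."""
    X[years.index(year), idx[i], idx[j]] += WEIGHTS[relation]

# A few toy relations (hypothetical, for illustration only).
add_relation(2016, "co-wrote", "Batterink L.", "Lewis P.")
add_relation(2016, "wrote-article", "Batterink L.", "article_phase_slow_osc")
add_relation(2016, "contained", "article_phase_slow_osc", "sleep")
add_relation(2016, "co-occurs", "sleep", "memory")
print(X[years.index(2016)])
```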
Cosine similarity

The cosine similarity is an adequate distance metric for vectors whose magnitude depends on the size of the sample, such as the frequency of words in a document:

s = \frac{x \cdot y}{\|x\|_2 \|y\|_2}.    (2)

Tensor memory hypothesis

A fourth-order tensor can be decomposed as

X \approx G \times_1 A_s \times_2 A_p \times_3 A_o \times_4 A_t.    (3)

The probability of the existence of the relationship between the entities of a quad is given by

P((s, p, o, t)) = \mathrm{sig}(\theta_{s,p,o,t}),    (4)

where

\mathrm{sig}(x) = \frac{1}{1 + e^{-x}},    (5)

\theta_{s,p,o,t} = f^e(\mathbf{a}_{e_s}, \mathbf{a}_{e_p}, \mathbf{a}_{e_o}, \mathbf{a}_{e_t}),    (6)

f^e(\mathbf{a}_{e_s}, \mathbf{a}_{e_p}, \mathbf{a}_{e_o}, \mathbf{a}_{e_t}) = \sum_{r_1=1}^{\bar{r}} \sum_{r_2=1}^{\bar{r}} \sum_{r_3=1}^{\bar{r}} \sum_{r_4=1}^{\bar{r}} a_{e_s,r_1} a_{e_p,r_2} a_{e_o,r_3} a_{e_t,r_4} \, g^e(r_1, r_2, r_3, r_4).    (7)

The analysis of tensors, as of matrices, can be performed using a reduced form obtained by factorization. One popular factorization method for tensors is the Tucker representation; however, there are other matrix and tensor decomposition algorithms. Here I used RESCAL, and the construction of the tensor with weighted values allows omitting the predicate dimension, so the characteristic function becomes

f^e(\mathbf{a}_{e_s}, \mathbf{a}_{e_o}, \mathbf{a}_{e_t}) = \sum_{r_1=1}^{\bar{r}} \sum_{r_2=1}^{\bar{r}} \sum_{r_3=1}^{\bar{r}} a_{e_s,r_1} a_{e_o,r_2} a_{e_t,r_3} \, g^e(r_1, r_2, r_3).    (8)

RESCAL

This tensor decomposition was proposed by (Nickel, 2013). The decomposed tensor needs to have two dimensions of the same size, i.e., X ∈ ℝ^{n×n×m}, and the results are a matrix A ∈ ℝ^{n×r} and a tensor R ∈ ℝ^{r×r×m}:

X \approx R \times_1 A \times_2 A, \qquad X_k = A R_k A^T.    (9)

The algorithm is an alternating least squares (ALS) procedure where the outputs are updated with

A \leftarrow \left( \sum_{k=1}^{m} X_k A R_k^T + X_k^T A R_k \right) \left( \sum_{k=1}^{m} R_k A^T A R_k^T + R_k^T A^T A R_k + \lambda_A I \right)^{-1},    (10)

R_k \leftarrow V \left( P \ast U^T X_k U \right) V^T,    (11)

where R_k is a slice of the tensor R and, for optimization, a singular value decomposition of the matrix A is employed,

A = U \Sigma V^T.    (12)

P is the matrix such that \mathrm{diag}(\mathrm{vec}(P)) = \hat{S}, which can be constructed by rearranging the diagonal entries of \hat{S} via the inverse vectorization operator \mathrm{vec}_r^{-1}(\cdot). Then, for regularization, the Kronecker product of the diagonal matrix is employed:

S = \Sigma \otimes \Sigma,    (13)

\hat{S}_{ii} = \frac{S_{ii}}{S_{ii}^2 + \lambda_R}.    (14)

Non-negative Matrix Factorization (NMF)

This matrix factorization method finds two matrices W ∈ ℝ^{n×r} and H ∈ ℝ^{r×m} whose product minimizes the Frobenius norm of the difference with the original matrix X ∈ ℝ^{n×m},

X \approx W H.    (15)

The updates using the algorithm proposed by (Lee and Seung, 2001) are

W \leftarrow W \, \frac{X H^T}{W H H^T + \delta},    (16)

H \leftarrow H \, \frac{W^T X}{W^T W H + \delta}.    (17)

Non-smooth Non-negative Matrix Factorization (nsNMF)

This decomposition is a modification of NMF used by (Kang and Lin, 2018),

X \approx W S H,    (18)

where

S = (1 - \theta) I + \frac{\theta}{k} \mathbf{1}\mathbf{1}^T,    (19)

D = \left( \sum_{j=1}^{m} H_{i,j} \right) I,    (20)

and, using

W = W_h D^{-1} S^{-1},    (21)

the matrix decomposition can finally be expressed as

X \approx W D^{-1} S^{-1} S D H.    (22)
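The multiplicative updates of Eqs. (16) and (17) can be written compactly; the following is a minimal generic sketch with random data, where the rank, the damping constant delta, the iteration count, and the toy matrix are assumptions, not the settings used in this work.

```python
import numpy as np

def nmf(X, rank, n_iter=200, delta=1e-9, seed=0):
    """Multiplicative-update NMF (Lee and Seung, 2001): X ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + delta)   # Eq. (16)
        H *= (W.T @ X) / (W.T @ W @ H + delta)   # Eq. (17)
    return W, H

# Toy non-negative matrix, e.g. an entity-by-word weight matrix collapsed over time.
rng = np.random.default_rng(3)
X = rng.random((30, 20))
W, H = nmf(X, rank=5)
print(np.linalg.norm(X - W @ H))
```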
Funding

This work was supported by Beca Doctorado Nacional Conicyt, Folio No 21180640.

References

Alshareef AM, Alhamid MF, El Saddik A. Recommending Scientific Collaboration Based on Topical, Authors and Venues Similarities. 2018 IEEE International Conference on Information Reuse and Integration (IRI). 2018; p. 55–61. https://ieeexplore.ieee.org/document/8424687/, doi: 10.1109/IRI.2018.00016.

Bizer C, Heath T, Berners-Lee T. Linked data: The story so far. In: Semantic services, interoperability and web applications: emerging concepts. IGI Global; 2011. p. 205–227.

Eichenbaum H. Memory on time. Trends in Cognitive Sciences. 2014; 17(2):81–88. doi: 10.1016/j.tics.2012.12.007.

Eichenbaum H. Time cells in the hippocampus: A new dimension for mapping memories. Nature Reviews Neuroscience. 2014; 15(11):732–744. doi: 10.1038/nrn3827.

Griffiths TL, Steyvers M. Finding scientific topics. Proceedings of the National Academy of Sciences. 2004; 101(suppl 1):5228–5235.

Kang Y, Lin KP. Topic Diffusion Discovery based on Sparseness-constrained Non-negative Matrix Factorization. 2018; doi: 10.1109/IRI.2018.00021.

Lee D, Seung H. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems. 2001; (1):556–562. http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization, doi: 10.1109/IJCNN.2008.4634046.

Ma Y, Tresp V, Daxberger E. Embedding Models for Episodic Memory. 2018 Jun; http://arxiv.org/abs/1807.00228.

Nickel M. Tensor Factorization for Relational Learning. 2013; p. 161. http://nbn-resolving.de/urn:nbn:de:bvb:19-160568.

Tkaczyk D, Szostek P, Fedoryszak M, Dendek PJ, Bolikowski Ł. CERMINE: automatic extraction of structured metadata from scientific literature. International Journal on Document Analysis and Recognition (IJDAR). 2015; 18(4):317–335.

Tresp V, Ma Y. The Tensor Memory Hypothesis. 2017; http://arxiv.org/abs/1708.02918.

Tresp V, Ma Y, Baier S, Yang Y. Embedding learning for declarative memories. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2017; 10249 LNCS:202–216. doi: 10.1007/978-3-319-58068-5_13.

Wei T, Li M, Wu C, Yan XY, Fan Y, Di Z, Wu J. Do scientists trace hot topics? Scientific Reports. 2013; 3:3–7. doi: 10.1038/srep02207.

Zeng A, Shen Z, Zhou J, Wu J, Fan Y, Wang Y, Stanley HE. The science of science: From the perspective of complex systems. Physics Reports. 2017; 714-715:1–73. https://doi.org/10.1016/j.physrep.2017.10.001, doi: 10.1016/j.physrep.2017.10.001.