

       Determining the Directions of Links in Undirected
                     Networks of Terms

              © Dmytro Lande 1,2,3[0000-0003-3945-1178] © Oleh Dmytrenko 1[0000-0001-8501-5313]
                               © Oksana Radziievska3[0000-0003-3813-3987]
             1 Institute for Information Recording of NAS of Ukraine, Kyiv, Ukraine
    2 National Technical University “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine
3 Scientific Research Institute for Informatics and Law of National Academy of Legal Sciences

                                   of Ukraine, Kyiv, Ukraine
      dwlande@gmail.com          dmytrenko.o@gmail.com radeoksa@gmail.com



         Abstract. This paper examines and analyzes approaches for constructing networks of terms as ontological models of a subject domain. In particular, new approaches and rules for determining the syntactic and semantic links between terms in a text, and the directions of these links between nodes in undirected networks of terms constructed from the terms of a thematic text corpus, are proposed and studied. In addition, one of the methods for creating terminological ontologies – the algorithm for building thematic networks of natural hierarchies of terms based on the analysis of text corpora – is considered and used to build a directed network of words and phrases (separate unigrams, bigrams and trigrams). The well-known fairy tale “The Story of Little Red Riding Hood” is used as an example to demonstrate the accuracy of the proposed rules. The Python programming language and individual functions of a specialized add-in, the NLTK module (Natural Language Toolkit, an open-source library), are used for the software implementation of the proposed and considered approaches and methods. Using the graph modelling and visualization software Gephi, the built directed networks of terms were visualized for better visual perception. The proposed approach can be used for the automatic creation of terminological ontologies of subject domains with the participation of experts. The research results can also be used to create personal search interfaces for users of information retrieval systems and in database navigation systems, helping users of such systems simplify the search for relevant information.

         Keywords: Subject Domain, Terminological Ontology, Network of Terms, Horizontal Visibility Graph, Network of Natural Hierarchies of Terms, Syntactic and Semantic Links, Undirected Network, Directed Network.

         Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


1        Formulation of the Problem


The development of computer technologies and, in particular, of the Internet as a source of information resources and a dynamic source of texts opens new opportunities for developing and applying improved methods of text research. There are different methods,
methodologies and techniques of computerized text processing and analysis. Modern software is increasingly in need of ready-made solutions to improve its systems.
   It should be noted that formalizing the knowledge of a subject domain is very important when studying it. This process of representing, formally naming and defining the categories, properties and relations between concepts, data and entities is known as ontology modeling of the subject domain. A network of terms can be considered a model of a subject domain. In such a network, nodes correspond to individual words and phrases in the text, and edges to the links between them. The process of creating an ontology is usually very complex and resource-intensive and, besides this, it is still an unsolved scientific and practical problem [1]. A separate step in this formalization is identifying the basic objects. In the case of building networks of terms, this step includes creating dictionaries, thesauri and subject dictionaries of terms based on the text corpus. The task of effectively selecting individual terms from a text corpus, and of automating such selection, is still open, important and unresolved [2, 3].
   Due to the complexity of natural language, determining the syntactic and semantic links between nodes that correspond to the terms in a text, and determining the directions of these links, is an equally complex and open problem of conceptualization.
   The purpose of this work is to propose and present new approaches for determining the directions of links between nodes in undirected networks of terms built from the words and phrases (separate unigrams, bigrams and trigrams) of a thematic text corpus.


2      Method for Building Undirected Networks of Terms

There are several approaches for transforming texts into a network of terms and different ways to interpret nodes and connections [4, 5], which leads to different kinds of representation of these networks [6].
   In this work, the compactified horizontal visibility graph (CHVG) algorithm is used to create terminological ontologies of subject domains for key terms (separate unigrams, bigrams and trigrams).


2.1    Compactified Horizontal Visibility Graph (CHVG) Algorithm
The horizontal visibility graph (HVG) algorithm [7, 8, 9] is a modification of a common
visibility algorithm [10].
   In [11], the following steps are proposed to build undirected networks of terms using the HVG algorithm. In the first step, a sequence of nodes is marked on the horizontal axis, each node corresponding to a term in the order in which the terms occur in the text; the weighted values – numerical estimates xi intended to reflect how important a word is to a document in a collection or corpus – are marked on the vertical axis. In the second step, the horizontal visibility graph is built: two nodes ti and tj, corresponding to the elements xi and xj of the series, are connected in the HVG if and only if xk < min(xi, xj) for all tk with ti < tk < tj.
   In the third stage, the network obtained in the previous steps is compactified: the nodes that correspond to the same terms are merged into a single node. The obtained undirected network of terms is called the compactified horizontal visibility graph (CHVG) (see Fig. 1).




       Fig. 1. Stages of building the compactified horizontal visibility graph [11].

   Thus, the CHVG algorithm allows building an undirected network of terms in the case when numerical values are assigned to separate words or phrases (separate unigrams, bigrams and trigrams) of a thematic text corpus.
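   As an illustration, a minimal sketch of the HVG construction and the compactification step might look as follows; the networkx library and the toy term weights are assumptions of this sketch, not part of the original method description:

# A minimal sketch of the HVG + compactification step, assuming each term
# already has a numerical weight (e.g. its GTF score).
import networkx as nx

def build_chvg(terms, weights):
    """terms: list of terms in text order; weights: parallel list of scores."""
    hvg = nx.Graph()
    hvg.add_nodes_from(range(len(terms)))
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            # horizontal visibility: every value strictly between i and j
            # must lie below min(x_i, x_j)
            if all(weights[k] < min(weights[i], weights[j]) for k in range(i + 1, j)):
                hvg.add_edge(i, j)
    # compactification: merge all positions of the same term into one node
    chvg = nx.Graph()
    for i, j in hvg.edges():
        if terms[i] != terms[j]:
            chvg.add_edge(terms[i], terms[j])
    return chvg

# toy usage
graph = build_chvg(["wolf", "wood", "grandmother", "wolf"], [0.3, 0.1, 0.5, 0.3])
print(sorted(graph.edges()))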


2.2    Text Corpora Pre-processing
   The languages we speak and write are made up of many words, often derived from one another. A language in which words change their form as their use in speech changes is called an inflected language; the inflected forms of a word share a common root form.
   In this section, we briefly describe the main stages of processing text documents: tokenization, part-of-speech tagging, lemmatization, stop-word removal, stemming and term weighting.

   Tokenization and lemmatization
   For preliminary lexical analysis, the text is broken up into its individual words (tokens); this step is called tokenization.
    Lemmatization usually refers to doing things properly with the use of vocabulary and
morphological analysis of words, normally aiming to remove inflectional endings only
and to return the base or dictionary form of a word, which is known as the lemma. A
lemma (plural lemmas or lemmata) is the canonical form, dictionary form, or citation
form of a set of words.
    For example, "runs", "running", "ran" are all forms of the word "run", therefore "run"
is the lemma of all these words. Because lemmatization returns an actual word of the
language, it is used where it is necessary to get valid words.
   In this work, the “WordNet Lemmatizer” provided by Python NLTK was used to lemmatize the tokens. The “WordNet Lemmatizer” uses the WordNet database to look up the lemmas of words.
   Tokenization and lemmatization are usually the initial stages of word processing because they allow working with a word as a single entity while taking its context into account [12].
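   A minimal sketch of these two stages with NLTK is given below; the sample sentence is illustrative, and the names of the downloadable NLTK resources may vary slightly between NLTK versions:

import nltk
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)      # tokenizer models
nltk.download("wordnet", quiet=True)    # WordNet database for the lemmatizer

text = "The wolf was running through the woods."
tokens = word_tokenize(text)            # ['The', 'wolf', 'was', 'running', ...]

lemmatizer = WordNetLemmatizer()
# lemmatize as verbs (pos="v"); without a POS hint the lemmatizer assumes nouns,
# so "running" would be left unchanged
lemmas = [lemmatizer.lemmatize(t.lower(), pos="v") for t in tokens]
print(lemmas)                           # e.g. ['the', 'wolf', 'be', 'run', ...]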

    Part-of-Speech Tagging
    POS tagging is one of the first steps in computer text analysis.
   Before lemmatization, it is necessary to provide the context in which a word is to be lemmatized, that is, its part of speech (POS) [13].
    In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST),
also called grammatical tagging or word-category disambiguation, is the process of
marking up a word in a text (corpus) as corresponding to a particular part of speech,
based on both its definition and its context—i.e., its relationship with adjacent and re-
lated words in a phrase, sentence, or paragraph. A simplified form of this is commonly
taught to school-age children, in the identification of words as nouns, verbs, adjectives,
adverbs, etc.
    In general, PoS tagging algorithms are divided into two distinct groups: rule-based
and stochastic. E. Brill's tagging method [14], which uses rule-based algorithms, is the
first and most widely used method of tagging English-language texts.
    “Part of Speech tagging” is one of the more powerful aspects of the NLTK module
in the Python programming language. Basically, the goal of a POS tagger is to assign
linguistic (mostly grammatical) information to sub-sentential units - tokens.
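   A minimal sketch of POS tagging with NLTK is given below; the tags follow the Penn Treebank tagset used by nltk.pos_tag, and the sentence and downloadable resource names are illustrative:

import nltk
from nltk import pos_tag, word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = word_tokenize("Little Red Riding Hood walked through the wood.")
print(pos_tag(tokens))
# e.g. [('Little', 'JJ'), ('Red', 'NNP'), ('Riding', 'NNP'), ('Hood', 'NNP'),
#       ('walked', 'VBD'), ('through', 'IN'), ('the', 'DT'), ('wood', 'NN'), ('.', '.')]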

   Stop Words Removal
   Also, after the pre-processing stage of the textual documents and the extraction of key terms, in this study it is proposed to remove stop words, which have no semantic strength, that is, are informationally unimportant, as well as bigrams containing at least one stop word and trigrams that start or end with a stop word. In general, stop words are words that carry no meaning important enough to be used in search queries. Usually, these words are filtered out of search queries because they return a vast amount of unnecessary information. Mostly they are words that are commonly used in the English language, such as 'as', 'the', 'be', 'are', etc.
   The stop dictionary used in this work was based on different stop dictionaries, which are available at:
   https://code.google.com/archive/p/stop-words/downloads/;
   http://www.textfixer.com/tutorials/common-english-words.php.
   It should be noted that each programming language provides its own list of stop words. In this work, the “SnowballStemmer” (a stemmer implemented in the Python NLTK library – Natural Language Toolkit) was also used to ignore stop words.
   Also, the formed stop dictionary was expanded by adding other stop words that were identified by experts within the considered subject domain.
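   A minimal sketch of the stop-word filtering described above is given below; it assumes NLTK's English stop-word list, and the extra expert-defined stop words are placeholders:

import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)

stop_words = set(stopwords.words("english"))
stop_words.update({"oh", "upon"})          # hypothetical expert additions

unigrams = ["the", "wolf", "ate", "the", "grandmother"]
bigrams = [("red", "riding"), ("the", "wolf")]
trigrams = [("little", "red", "riding"), ("the", "big", "wolf")]

kept_unigrams = [u for u in unigrams if u not in stop_words]
# drop bigrams containing at least one stop word
kept_bigrams = [b for b in bigrams if not any(w in stop_words for w in b)]
# drop trigrams that start or end with a stop word
kept_trigrams = [t for t in trigrams if t[0] not in stop_words and t[-1] not in stop_words]
print(kept_unigrams, kept_bigrams, kept_trigrams)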

   Stemming
   After the stages described above, in order to combine words that have a common root into a single word, it is proposed to carry out stemming. Stemming is the process of reducing inflected words to their root forms, mapping a group of words to the same stem even if the stem itself is not a valid word of the language [15]. Stemming usually refers to a crude heuristic process that chops off the ends of words and often includes the removal of derivational affixes. Words having the same stem will therefore have a similar meaning. The results of stemming are similar to determining the root of a word, but stemming algorithms are based on other principles [16]. That is why, after stemming (processing with a stemmer), a word may differ from its morphological root.
   The goal of both stemming and lemmatization is to reduce inflectional forms and
sometimes derivationally related forms of a word to a common base form.
   However, the two processes differ in that stemming most commonly collapses deri-
vationally related words, whereas lemmatization commonly only collapses the different
inflectional forms of a lemma.
   If confronted with the token "saw", stemming might return just "s", whereas lemmatization would attempt to return either "see" or "saw" depending on whether the use of the token was as a verb or a noun.
   To avoid the confusion described above, in this work the lemmatization process pre-
cedes the stemming process.
   Several stemming algorithms can be distinguished in terms of performance, accu-
racy, and how stemming problems are overcome [17].
   The most common algorithm for stemming English, and one that has repeatedly been shown to be empirically very effective, is Porter's algorithm [18, 19]. In this work, the “PorterStemmer” implemented in the Python NLTK (Natural Language Toolkit) library was used. This stemmer is known for its simplicity and speed. As a result of its use, words having the same stem will have a similar meaning.
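   A minimal sketch of this step with NLTK's PorterStemmer is given below; the token list is illustrative, and the stems shown are the kind of truncated forms that also appear in Tables 1-3:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
tokens = ["grandmother", "beautiful", "riding", "strange", "huntsman"]
stems = [stemmer.stem(t) for t in tokens]
print(stems)   # e.g. ['grandmoth', 'beauti', 'ride', 'strang', 'huntsman']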
   The pre-processing stages described above allow normalizing the text corpus.


2.3    Weighting and Extraction of the Key Terms
After the pre-processing stages, the weighting and extraction of the key terms are performed. To form a time series, i.e. a function that maps each term to a number, this study uses GTF (Global Term Frequency) [22], a modification of the classic statistical weight indicator TF-IDF (Term Frequency – Inverse Document Frequency) [20, 21], as the weight value of the terms.
   This approach yields a high statistical indicator of importance for the elements of the text that are informationally important in the global context.
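   As a reference point, a minimal sketch of the classic TF-IDF weighting that GTF modifies is given below; the exact GTF formula is defined in [22] and is not reproduced here:

import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists; returns one dict of term weights per document."""
    n_docs = len(docs)
    # document frequency: in how many documents each term occurs
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [["wolf", "grandmoth", "wolf"], ["wolf", "wood"], ["grandmoth", "bed"]]
print(tf_idf(docs)[0])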


3      Rules for Determining the Directions of Links

As was mentioned above, determining the directions of links is a complex and open problem of ontology creation. Below, we consider several new approaches for determining the directions of links between nodes in undirected networks of terms built from the words and phrases (separate unigrams, bigrams and trigrams) of a thematic text corpus.
   Let G be the undirected network of terms built according to the rules described above: G := (V, T), where V is the set of nodes and T is the set of unordered pairs of nodes from V that correspond to the causal links between the nodes.
   It is supposed that a causal link exists in the direction from the node ti to the node tj for all ti, tj such that (ti, tj) ∈ T if:
   1. the numerical value of the node ti corresponding to a) its degree [23, 24], b) its HITS score [25], or c) its PageRank score [26] is higher than the corresponding value of the node tj (a sketch of this rule and the next one is given after the list);
   2. within the sentence, the term to which the node ti corresponds precedes the term to which the node tj corresponds;
   3. the term to which the node ti corresponds is shorter than the term to which the node tj corresponds.
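   A minimal sketch of the first two rules is given below; the networkx library is an assumption of the sketch (the paper reports results for degree, HITS and PageRank alike), and the first occurrence of a term in the text is used here as a simple proxy for the within-sentence precedence of the second rule:

import networkx as nx

def orient_by_score(undirected):
    """Rule 1: direct each link from the higher-scored node to the lower-scored one."""
    scores = nx.pagerank(undirected)            # degree or HITS could be used instead
    directed = nx.DiGraph()
    directed.add_nodes_from(undirected.nodes())
    for u, v in undirected.edges():
        if scores[u] >= scores[v]:
            directed.add_edge(u, v)
        else:
            directed.add_edge(v, u)
    return directed

def orient_by_order(undirected, terms_in_text_order):
    """Rule 2 (approximation): direct each link from the term that occurs earlier in the text."""
    first_pos = {}
    for pos, term in enumerate(terms_in_text_order):
        first_pos.setdefault(term, pos)
    directed = nx.DiGraph()
    directed.add_nodes_from(undirected.nodes())
    for u, v in undirected.edges():
        if first_pos.get(u, 0) <= first_pos.get(v, 0):
            directed.add_edge(u, v)
        else:
            directed.add_edge(v, u)
    return directed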
   One of the methods for creating terminological ontologies – the algorithm for building thematic networks of natural hierarchies of terms based on the analysis of text corpora – is used to build the directed network of words and phrases (separate unigrams, bigrams and trigrams) according to the third rule. The work [27] notes that the algorithm for building networks of natural hierarchies of terms provides for building a compactified horizontal visibility graph and determining the directions of links between the key terms according to the rule: a word is a part of a two-term or a three-term phrase, and a two-term phrase is a part of a three-term phrase.
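   A minimal sketch of this containment rule is given below; representing phrases as tuples of words and using the networkx library are assumptions of the sketch:

import networkx as nx

def natural_hierarchy_edges(unigrams, bigrams, trigrams):
    """Direct a link from a term to every longer phrase that contains it."""
    g = nx.DiGraph()

    def contains(phrase, part):
        n = len(part)
        return any(phrase[i:i + n] == part for i in range(len(phrase) - n + 1))

    for uni in unigrams:
        for phrase in bigrams + trigrams:
            if contains(phrase, (uni,)):
                g.add_edge(uni, " ".join(phrase))
    for bi in bigrams:
        for tri in trigrams:
            if contains(tri, bi):
                g.add_edge(" ".join(bi), " ".join(tri))
    return g

g = natural_hierarchy_edges(
    ["red", "hood"], [("red", "ride"), ("ride", "hood")], [("red", "ride", "hood")])
print(sorted(g.edges()))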


4      Results of the Study of the Proposed Approaches

The proposed approaches for determining the directions of links in undirected networks of terms were tested on the example of an English-language text, namely the well-known fairy tale “The Story of Little Red Riding Hood”.
   According to the method described above, the text pre-processing and the extraction of the key terms (separate unigrams, bigrams and trigrams) were performed (Tables 1, 2 and 3).


Table 1. Top 16 key unigrams and their GTF, degree, HITS and PageRank values for the text “The Story of Little Red Riding Hood”.

             Unigrams          GTF         Degree        HITS         PageRank
            grandmoth          0.065        49           0.444        0.0545
            red                 0.059       32           0.3099       0.036
            hood                0.053       22           0.256        0.0252
            ride                0.053       2            0.051        0.0029
            wolf                0.031       30           0.301        0.0327
            wood                0.025       17           0.204        0.0169
            bed                 0.019       18           0.191        0.0186
            open                0.016       13           0.19         0.0148
            beauti              0.016       15           0.155        0.0154
            big                 0.012       9            0.088        0.0119
            cap                 0.012       12           0.162        0.0141
            cake                0.012       9            0.101        0.0116
            cut                 0.009       10           0.095        0.0127
            strang              0.009       13           0.116        0.0167
            ate                 0.009       7            0.134        0.0081
            huntsman            0.009       11           0.113        0.0148

 Table 2. Top 15 key bigrams and their GTF, degree, HITS and PageRank values for the text “The Story of Little Red Riding Hood”.

              Bigrams             GTF           Degree     HITS         PageRank
         ride_hood                0.053          26        0.465         0.0277
         red_ride                  0.053         28        0.494         0.0303
         grandmoth_big             0.009         11        0.177         0.0095
         hood_grandmoth            0.006         4         0.107         0.0047
         leav_path                 0.006         7         0.162         0.0084
         grandmoth_live            0.006         6         0.164         0.0071
         wolf_bodi                 0.006         7         0.160         0.0080
         grandmoth_bed             0.006         5         0.04          0.0060
         straight_grandmoth        0.006         7         0.133         0.0076
         beauti_wood               0.006         6         0.133         0.0070
         cake_wine                 0.006         8         0.172         0.0084
         wood_wolf                 0.006         7         0.193         0.0076
         press_latch               0.006         8         0.047         0.0064
         grandmoth_sick            0.006         5         0.096         0.0058
         door_open                 0.006         8         0.109         0.0085




 Table 3. Top 26 key trigrams and their GTF, degree, HITS and PageRank values for the text “The Story of Little Red Riding Hood”.

               Trigrams                GTF        Degree       HITS       PageRank
         red_ride_hood                0.1429        36         0.67        0.1042
         grandmoth_what_big           0.0252         6         0.145        0.0114
         press_the_latch              0.0168         6         0.059        0.0111
         leav_the_path                0.0168         4         0.153        0.0126
         sick_and_weak                0.0168         6         0.182        0.0179
         bed_and_pull                 0.0168         7         0.162        0.0188
         cake_and_wine                0.0168         5         0.160        0.0133
         hear_how_beauti              0.0084         2         0.111        0.0076
         look_so_strang               0.0084         2         0.03         0.0071
         hood_and_ate                 0.0084         2         0.111        0.0079
         listen_littl_red             0.0084         2         0.129        0.007
         bite_he_climb                0.0084         2         0.001        0.0101
         obey_her_mother              0.0084         2         0.021        0.0084
         lay_the_wolf                 0.0084         2         0            0.0105
         mind_your_manner             0.0084         2         0.054        0.0068
         open_hi_belli                0.0084         2         0.003        0.0096
         cake_and_drank               0.0084         2         0.018        0.0090
         snore_veri_loudli            0.0084         2         0            0.0105
         strang_oh_grandmoth          0.0084         2         0.028        0.0059
         bird_are_sing                0.0084         2         0.021        0.0084
         bed_fell_asleep              0.0084         2         0            0.0103
         ride_hood_enter              0.0084         2         0.114        0.0074
         larg_heavi_stone             0.0084         2         0.004        0.0093
         woman_wa_snore               0.0084         2         0.111        0.0076
         loudli_a_huntsman            0.0084         2         0            0.0105
         red_ride_hood                0.0084         2         0            0.1042


   The following results were obtained after building the directed network according to the first rule for different measures of the network nodes (for the degree – Fig. 2; for HITS – Fig. 3; for PageRank – Fig. 4). Using the graph modeling and visualization software Gephi (https://gephi.org), the built directed networks of terms were visualized for better visual perception.
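   A minimal sketch of computing the node measures reported in Tables 1-3 and exporting the directed network for Gephi is given below; the GEXF export and the networkx library are assumptions of the sketch, since the paper only states that Gephi was used for visualization:

import networkx as nx

def score_and_export(directed, path="terms_network.gexf"):
    degree = dict(directed.degree())
    hubs, authorities = nx.hits(directed)      # HITS scores
    pagerank = nx.pagerank(directed)
    for node in directed.nodes():
        directed.nodes[node]["degree"] = degree[node]
        directed.nodes[node]["hits"] = authorities[node]
        directed.nodes[node]["pagerank"] = pagerank[node]
    nx.write_gexf(directed, path)              # Gephi reads GEXF files directly
    return degree, authorities, pagerank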




 Fig. 2. Fragment of the directed network built according to the first rule for node degree.




  Fig. 3. Fragment of the directed network built according to the first rule for HITS.




Fig. 4. Fragment of the directed network built according to the first rule for PageRank.

  Fig. 5 shows the directed network of terms built according to the second rule.




     Fig. 5. Fragment of the directed network built according to the second rule.

   Fig. 6 shows the network of natural hierarchies of terms built according to the third rule.




               Fig. 6. Fragment of the network of natural hierarchies of terms.

   After analyzing the obtained results, it was found that the directed network built according to the second rule reflects the directions of links that exist between the terms in the considered text more precisely than the network built according to the first rule. The network of natural hierarchies of terms has its own peculiarities and advantages, so it is difficult to compare it with the networks built according to the first two rules. Taking into account the naturalness of the links determined in such a network, we can speak of their syntactic adequacy.
   Considering, for example, the directions of links determined for key terms, we can see that, according to the first rule, the links between “wolf”, “grandmother” and “red” are as follows (see Figs. 2, 3 and 4): for the degree, HITS and PageRank, the “grandmother” influences the “wolf” and the “red”, and the “red” influences the “wolf”. This does not correspond to the real directions of links that exist in the text in terms of content analysis. By contrast, according to the second rule, the “wolf” influences the “grandmother” and the “grandmother” influences the “red”, which corresponds to the content of the considered text.
   In comparison with the other rules, the rule for determining the directions of links in undirected networks of terms according to which, within the sentence, the term to which the node ti corresponds precedes the term to which the node tj corresponds (where (ti, tj) ∈ T) is the more informative of the first two rules, because the links determined according to this rule correspond more precisely to the content of the considered text, according to experts.


5      Conclusion

After studying the proposed rules for determining the directions of links in undirected networks of terms, it was found that the second rule more precisely reflects the directions of links that correspond to the content of the considered text, according to experts, and is more informative. On the example of an English-language text, the well-known fairy tale “The Story of Little Red Riding Hood”, an undirected network of terms was built. Using the proposed rules for determining the directions of links, directed networks of terms were obtained from the undirected network of terms. According to experts, the informative content of the network links built according to the second proposed rule is higher than for the other two rules. Taking into account the naturalness of the links determined in the network of natural hierarchies of terms, we can speak of their syntactic adequacy.
   The directed networks of words and phrases built according to the proposed approach can be used for the automatic creation of terminological ontologies of subject domains with the participation of experts. The research results can also be used to create personal search interfaces for users of information retrieval systems and in database navigation systems, helping users of such systems simplify the search for relevant information.
   As the task of improving the accuracy of determining the directions of links between nodes in undirected networks of words and phrases remains relevant, it is planned to continue working in this direction, developing new approaches and modifying existing ones.


References
 1. Lande, D., Snarsky, A.: Approach to Creation of Terminological Ontologies. Design ontol-
    ogy 2(12), pp. 83-91, (2014). (in Russian)
 2. Lukashevich, N., Dobrov, B., Chuiko, D.: Selection of Word Combinations for Automatic
    Word Processing System Dictionary. Computational Linguistics and Intellectual Technolo-
    gies: Proceedings of the International Conference «Dialogue–2008», pp. 339–344. Moscow
    (2008). (in Russian)
 3. Filippovich, Yu., Prokhorov, A.: Semantics of Information Technologies: Experiments of
    Dictionary-thesaurus Description. Moscow State University of Printing Arts, Moscow
    (2002). (in Russian)
 4. Ferrer-i-Cancho, R., & Solé, R.: The Small World of Human Language. in Proc. of the Royal
    Society of London, pp. 2261-2265. London (2001).
    doi: 10.1098/rspb.2001.1800.
 5. Caldeira, S. M. G., Petit Lobao, T. C., Andrade, R. F. S., Neme, A., & Miranda, J. G. V.:
    The network of concepts in written texts. The European Physical Journal B-Condensed Mat-
    ter and Complex Systems 49(4), 523-529 (2005).
 6. Ferrer-i-Cancho, R. F., Solé, R. V., & Köhler, R.: Patterns in syntactic dependency networks.
    Physical Review E 69(5), (2004).
    doi: 10.1103/PhysRevE.69.051915
 7. Luque, B., Lacasa, L., Ballesteros, F., & Luque, J.: Horizontal visibility graphs: Exact results
    for random time series. Physical Review E, 80(4), (2009).
    doi: 10.1103/PhysRevE.80.046103.
 8. Gutin, G., Mansour, T., & Severini, S.: A characterization of horizontal visibility graphs and
    combinatorics on words. Physica A: Statistical Mechanics and its Applications, 390(12),
    2421-2428 (2011).
    doi: 10.1016/j.physa.2011.02.031.
 9. Bezsudnov, I. V., & Snarskii, A. A.: From the time series to the complex networks: The
    parametric natural visibility graph. Physica A: Statistical Mechanics and its Applications,
    414, 53-60 (2014).
    doi: 10.1016/j.physa.2014.07.002.
10. Lacasa, L., Luque, B., Ballesteros, F., Luque, J., & Nuno, J. C.: From time series to complex
    networks: The visibility graph. Proceedings of the National Academy of Sciences, 105(13),
    4972-4975 (2008).
    doi: 10.1073/pnas.0709247105
11. Lande, D. V., Snarskii, A. A., Yagunova, E. V., & Pronoza, E. V.: The use of horizontal
    visibility graphs to identify the words that define the informational structure of a text. In:
    2013 12th Mexican International Conference on Artificial Intelligence, pp. 209-215 (2013).
12. Manning, C. D., Raghavan, P., & Schütze, H.: An Introduction to Information Retrieval.
    Cambridge University Press, 22–36 (2009).
13. Schmid, H.: Probabilistic Part-of-Speech Tagging Using Decision Trees. In: Proceedings of
    International Conference on New Methods in Language Processing, pp. 1–9. Manchester,
    UK (1994).
14. Brill, E.: A simple rule-based part of speech tagger. In: Proceedings of the third conference
    on Applied natural language processing (ANLC '92). Association for Computational Lin-
    guistics, pp. 152-155. Stroudsburg. PA. USA (1992).
    doi:10.3115/974499.974526
15. Jongejan, B., & Dalianis, H.: Automatic training of lemmatization rules that handle morpho-
    logical changes in pre-, in-and suffixes alike. In Proceedings of the Joint Conference of the
    47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural
    Language Processing of the Asian Federation of Natural Language Processing, pp. 145-153.
    Association for Computational Linguistics, Singapore (2009)
16. Lovins, J. B.: Development of a stemming algorithm. Mech. Translat. & Comp. Linguistics
    11(1-2), 22-31 (1968).
17. Baeza-Yates, R., & Ribeiro-Neto, B.: Modern Information Retrieval. ACM Press, New York; Addison-Wesley, Harlow, England (2011).
18. Porter, M. F.: An algorithm for suffix stripping. Program 14(3), 130-137 (1980).
    doi: 10.1108/eb046814
19. Willett, P.: The Porter stemming algorithm: then and now. Program 40(3), 219-223 (2006).
    doi: 10.1108/00330330610681295.
20. Salton, G., & Buckley, C.: Term-weighting approaches in automatic text retrieval. Infor-
    mation processing & management 24(5), 513-523 (1988).
    doi:10.1016/0306-4573(88)90021-0
21. Rajaraman, A., & Ullman, J. D. Mining of massive datasets. Cambridge University Press
    (2011).
22. Lande, D.V., Dmytrenko, O.O., & Snarskii A.A.: Transformation texts into the complex
    network with applying visibility graphs algorithms. In: CEUR Workshop Proceedings (ceur-
    ws.org). Vol-2318 urn:nbn:de:0074-2318-4. Selected Papers of the XVIII International Sci-
    entific and Practical Conference on Information Technologies and Security (ITS 2018). vol.
    2318. pp. 95-106. (2018).
23. Bondy, J. A., & Murty, U. S. R.: Graph theory with applications. vol. 290. Macmillan, Lon-
    don (1976).
24. Godsil, C., & Royle, G.: Algebraic Graph Theory. Graduate Texts in Mathematics 207.
    Springer, New York (2001).
    doi: 10.1007/978-1-4613-0163-9
25. Kleinberg, J. M.: Authoritative sources in a hyperlinked environment. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 46(5), pp. 604–632 (1998).
26. Brin, S., & Page, L.: The anatomy of a large-scale hypertextual web search engine. Com-
    puter networks and ISDN systems, 30(1-7), 107-117 (1998).
    doi:10.1016/S0169-7552(98)00110-X
27. Lande, D.V.: Building of networks of natural hierarchies of terms based on analysis of texts
    corpora. arXiv preprint arXiv:1405.6068 (2014).