=Paper=
{{Paper
|id=Vol-3126/paper12
|storemode=property
|title=The information and analytical using of non-structured information resources
|pdfUrl=https://ceur-ws.org/Vol-3126/paper12.pdf
|volume=Vol-3126
|authors=Serhii Lienkov,Viacheslav Podlipaiev,Igor Tolok,Igor Lisitsky,Oleksii Fedchenko,Nataliia Lytvynenko,Svitlana Kuznichenko
}}
==The information and analytical using of non-structured information resources==
The Information and Analytical Using of Non-Structured
Information Resources
Serhii Lienkov (1), Viacheslav Podlipaiev (2), Igor Tolok (3), Igor Lisitsky (4), Oleksii Fedchenko (5), Nataliia Lytvynenko (6) and Svitlana Kuznichenko (7)
(1, 3, 5, 6) Military Institute of Taras Shevchenko National University of Kyiv, Lomonosova Str., 81, Kyiv, 03189, Ukraine
(2) Research Institute of Geodesy and Cartography, Velyka Vasyl’kivs’ka Str., Kyiv, 03150, Ukraine
(4) Central Research Institute of Armament and Military Equipment of the Armed Forces of Ukraine, Povitroflots’kyy Avenue, Kyiv, 03049, Ukraine
(7) Dept. of Information Technologies, Odessa State Environmental University, Odessa, Ukraine
Abstract
This article describes the conditions for forming interactive knowledge bases that are built on growing pyramidal networks obtained from the analysis of textual narratives. The stability conditions of knowledge systems are determined on the basis of their representation as logical-linguistic models. The authors also determine the conditions for the atypical representation of the knowledge carried by linguistic constructs in the process of their transformation into a system. The use of lambda-calculus notation for forming stable logical-linguistic models of narrative descriptions is proposed.
Keywords
logical-linguistic model, growing pyramidal networks, concepts, linguistic constructs, term, knowledge, narrative.
1. Introduction
The use of modern information in the activities of various specialists today is deeply interdisciplinary. Moreover, using various information resources to solve applied problems requires interactive knowledge bases with developed services. The effectiveness of their use depends on the truth of the content, which is determined by the information component.
The main practical part of productive knowledge today is concentrated in the form of text descriptions. At best, these narratives have a digital image in the formats of various editors and text display tools in computer systems. However, these digital images don’t have interactive services. Therefore, it’s quite important to create intelligent services that can turn these texts into structurally organized knowledge bases.
There is already the problem of using a large number of narratives, which should sufficiently expand intertextual connections. Solving it allows a digital image of the knowledge systems in use to be created in a single display format.
The first stage of transforming narrative descriptions into interactive knowledge bases that are able to interact with each other is the formation of logical-linguistic models of the text descriptions.
ISIT 2021: II International Scientific and Practical Conference «Intellectual Systems and Information Technologies», September 13–19, 2021, Odesa, Ukraine
EMAIL: lenkov_s@ukr.net (1); pva_hvu@ukr.net (2); igortolok@72gmail.com (3); igor.lisitsky@gmail.com (4); a_fedchenko@ecomm.kiev.ua (5); n123n@ukr.net (6); skuznichenko@gmail.com (7)
ORCID: 0000-0001-7689-239X (1); 0000-0002-7264-0520 (2); 0000-0001-6309-9608 (3); 0000-0002-1505-199X (4); 0000-0003-1343-3828 (5); 0000-0002-2203-2746 (6); 0000-0001-7982-1298 (7)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
2. Research results and analysis
2.1. The constructive of logical-linguistic models formation
The information base of any interactive knowledge system consists of different data types [1, 2]. These data have certain functional properties and form a rather complex structure of interdependent relations. Moreover, the information base of systems of this class is dual in nature: on the one hand, the data that make it up have certain logical relationships; on the other hand, some of them are concepts and linguistic constructs (hereinafter concepts), so the data have linguistic attributes [3]. The functionality of these data is expressed as symbolic and numerical formulas, which represent certain sequences of computational operations [1-3]. The linguistic structures of these data are presented as sequences of words in the form of sentences, statements, etc. [2].
However, it should be noted that everything related to the data will be presented through the concept of the term [3]. It follows that each sequence of symbols of finite length (SSFL), including numbers and their representation in the form of formulas, can be considered as a rule and can also be represented as a term. From these formula-rules it’s possible to form certain formal linguistic structures that are displayed according to the syntax defined for them.
Further we will consider finite sequences of characters that are set-like in nature, that is, they can be combined into sets on certain grounds. Moreover, these sets can be represented as hierarchically related classes. Each such class includes sequences that have at least one common property [1, 3]. Such classes of SSFLs with properties form a certain topology, and therefore they can be represented as trees [2, 3]. One such tree type is the growing pyramidal network (GPN) [4, 5]. Its attraction is the ability to automatically divide SSFLs into appropriate classes based on the specified properties of each SSFL.
The condition that SSFLs are divided into classes according to certain properties defines them as intensional [2], that is, as having signs-meanings, which we will define as the contexts of SSFLs. SSFLs that have a defined non-empty set of contexts will be defined as concepts and denoted by the variables x, y, z, ..., and the classes they form by the letters X, Y, Z, ... and so on. The presence of certain contexts in SSFL-concepts will be represented according to the notation of λ-calculus (lambda-calculus), namely $X[\,]$ [3]. The bracket $[\,]$ is called a “context hole”. It’s clear that the presence of the hole means that the concepts aren’t connected. Once we determine the term that can fill the hole, we get connected SSFL terms. Then all classes formed by SSFL-concepts are extensional [3]. We’ll denote a property of SSFL-concepts by the letter r, and the set of properties by R.
The hierarchical structures formed from SSFLs in the form of a GPN are labeled trees. Their labels are SSFL-concepts that are class names, and SSFL-concepts that aren’t extensional, that is, have only one semantic meaning. SSFL-concepts that have only one meaning can’t be reduced, that is, broken down into simpler concepts. Such SSFL-concepts will hereafter be called terminal [4, 5].
All SSFL-concepts form a certain set of names, which are the labels of all GPN nodes. Under such conditions, the GPN is unique to the set of Böhm trees [1-3]. That is, the topology of the interaction of sets of SSFL-concepts can be represented as a set of labeled trees formed by GPN nodes:
$$X_1, X_2, \ldots, X_n, a_1, a_2, \ldots, a_m, \qquad (1)$$
where $X_i$ is a class of SSFL-concepts and $a_j$ is a terminal node (a non-extensional SSFL-concept).
Having determined the property classes $R_1, R_2, \ldots, R_m$ that implement the division of all GPN concepts into classes, and having determined the relationships between the concepts, we obtain the corresponding GPN. According to [4-6], each GPN is a taxonomy.
Based on the condition formulated at the beginning, namely that an SSFL of arbitrary type is a term, it can be argued that all names of SSFL-concepts form a set of terms represented in the notation of lambda calculus [3]. This allows us to consider all SSFL-concepts and their meanings nominally. This condition is met because the SSFL-concepts presented in expression (1) aren’t related by a strict ordering relation. Moreover, when we move on to the GPN, it’s always possible to distinguish many sets of SSFL-concepts that also aren’t related by a strict ordering relation.
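The division of concepts into classes by shared properties, as in expression (1), can be illustrated with a small sketch. This is not the GPN construction algorithm of [4, 5]; it is a toy grouping under stated assumptions: the function name `build_taxonomy`, the sample concepts, and the rule that a property forms a class only when at least two concepts share it are all hypothetical. Concepts that fall into no class are treated here as terminal nodes, which simplifies the paper's notion of non-extensional concepts.

```python
from collections import defaultdict


def build_taxonomy(concepts):
    """Group concepts into classes X_i by shared properties r.

    `concepts` maps a concept name to the set of its properties.
    A property forms a class only if at least two concepts share it
    (an assumption of this sketch); concepts that fall into no class
    are kept as terminal nodes a_j.
    """
    members_by_property = defaultdict(set)
    for name, properties in concepts.items():
        for r in properties:
            members_by_property[r].add(name)

    # Keep only properties shared by two or more concepts.
    classes = {
        r: sorted(members)
        for r, members in sorted(members_by_property.items())
        if len(members) >= 2
    }
    classified = {name for members in classes.values() for name in members}
    terminals = sorted(set(concepts) - classified)
    return {"classes": classes, "terminals": terminals}


# Hypothetical toy data: concept name -> set of properties.
concepts = {
    "oak": {"tree", "plant"},
    "pine": {"tree", "plant"},
    "rose": {"plant"},
    "granite": {"rock"},
}
taxonomy = build_taxonomy(concepts)
print(taxonomy["classes"])    # {'plant': ['oak', 'pine', 'rose'], 'tree': ['oak', 'pine']}
print(taxonomy["terminals"])  # ['granite']
```

The result places no ordering on the classes or on the terminals, which mirrors the remark that the elements of expression (1) aren't related by a strict ordering relation.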
We’ll note one more constructive property of the GPN. Nodes that are hierarchically interconnected can form truth statements that can be calculated. Thus, the construction of the GPN from SSFL-concepts forms a certain system of knowledge in terms of λ-terms. Its information base consists of certain linguistic structures formed from SSFL-concepts, which are terms. The values of these terms required for calculations are determined in the process of assigning them the appropriate contexts. This process is interactive. According to [3], each term representing a certain SSFL-concept will be represented as a Böhm tree of the form (1). Then we can say the following: there is a meta-procedure that can turn the whole set of linguistic constructs into a GPN, which is a composition of Böhm trees and, in turn, a composition of many λ-terms formed by the SSFL-concepts of the same GPN. Therefore, the set of λ-terms can in fact be represented as a certain interactive knowledge base (IKB).
It’s clear that both the functional data and the linguistic structures that make up an interactive system of knowledge, which we present as a set of λ-terms, have certain relationships with each other, that is, they logically and functionally characterize each other in a certain way. Therefore, for further consideration it’s most effective to present the information base of an arbitrary IKB in aggregate form, implemented as the class of logical-linguistic models (LLM). This class of models is implemented on the basis of a predicative representation of information structures of arbitrary type [7-15]. This allows us to consider them together in an arbitrary sequence without defining strict or non-strict order relations. Also, all LLM objects are atypical. This atypicality makes it possible to define procedures that can jointly process the entire complex data structure making up the information base of interactive knowledge systems. The whole set of such data will then be defined as a separate class of atypical data, which allows it to be interpreted as nominal [3, 4].
The predicativeness of the linguistic constructs of the IKB, as a composition of Böhm trees, determines how statements are formed from the nodes of these trees. Moreover, the formation of the GPN as a composition of Böhm trees is also predicative. However, the process of LLM formation is realized by determining an order relation over certain sets of λ-terms, which leads to the loss of the nominal value of their terms. This gives the calculation of the contextual meanings of the terms a semantic nature and thus implements an interactive act of interaction with the information base.
$$X_1, X_2, \ldots, X_n, a_1, a_2, \ldots, a_m$$
$$X_1, X_2, \ldots, X_n, a_1, a_2, \ldots, a_m; \qquad (2)$$
$$X_1, X_2, \ldots, X_n$$
$$X_1\, B, X_2\, D, \ldots, X_n\, V, P; \qquad (3)$$
$$U\, x_1, x_2, \ldots, x_n, a_1, a_2, \ldots, a_m, \qquad (4)$$
Here the expressions involve the smallest element of all SSFL-context values, and $B, D, V, P$ are context values.
Expressions (2) - (4) reflect the generalized metaprocedure of IKB formation on the basis of defining the context values of SSFL-concepts and transforming them.
The introduction of the smallest value of the context and the definition of the contexts themselves passively determine the order relation over the set of λ-terms, and thus create the conditions for the formation of the GPN. That is, expressions (2) - (4) are recursive.
It can then be argued that an arbitrary LLM has a nonempty structure of relationships between SSFL-concepts, which has a hierarchical form and can be represented as a tree. The LLM is also an open structure. This means that the information base whose logical and linguistic characteristics it represents can be supplemented at any time with the latest concepts and their relationships. The open nature of the LLM means that this class of models has the property of inductiveness. That is, its graph model in the form of a tree can grow through the latest concepts and their relationships.
One of the effective types of graph models of the LLM is the growing pyramidal network (GPN) [4, 5]. Its positive distinguishing feature is that an arbitrary GPN is equivalent to an arbitrary taxonomy of a narrative description [1, 2, 6].
The attributes of the concepts that make up the GPN nodes can be contexts that describe their semantics; membership of a certain thematic class, which is determined by their semantics; relations between concepts, etc. That is, the inductive process of forming the new nodes of the GPN can be represented as a sequence of statements that are formed on the basis of the contexts of each inductively active concept. Thus, in the process of forming the GPN, as a structural reflection of the LLM, the formation of logical expressions of a certain
set of statements is realized. Using the attributes of each concept of these statements, it’s possible to form a formal expression written in the notation of the algebra of statements [3]. The names in this expression will be the names of concepts. This determines that the GPN is structurally unique to the formula of the algebra of statements, which is formed in terms of the concepts of the GPN, which are propositional variables, using the logical operations conjunction “∧”, disjunction “∨”, negation “¬” and implication (“following”) “→”.
2.2. The operational components of text transformation processes
All constructs of the LLM, namely statements, chains of GPN nodes and logical formulas, are certain terms. Linguistic constructs made of terms have an atypical representation and can also have a propositional character, which determines the nominal value of SSFL-concepts that are interpreted by formulas in the notation of the algebra of statements. Moreover, contexts that semantically define concepts that are propositional variables also characterize these concepts as dichotomous. This means that each statement formed on the basis of the concepts of the GPN is characterized by one of two meanings, that is, it answers arbitrary questions in the format “YES” or “NO”.
For expressions (1) - (4), this means that they are significant in the case of “YES” and may be disregarded in the case of “NO”. That is, provided that the contexts of the GPN form a true formula of expressions (2) - (4), an interactive knowledge base is formed. In the case of “NO”, which means that true statements haven’t been formed, the IKB or the corresponding fragment of the GPN isn’t implemented.
This greatly simplifies the formation of a training sample for an interactive knowledge system. It can be based on concepts whose significance with respect to the question of belonging to certain classes is true. That is, to the question of whether it is certain that a concept of the GPN belongs to a certain class or group of classes, we will always get the answer “YES”. But it is clear that when the latest concepts are included in the GPN, we will receive not only “YES” answers but also “NO” answers. This determines the conditions for expanding the training sample of the intelligent system.
According to the homotopy type theory [1, 2], the GPN is unilateral to the decision tree. Therefore, the representation of the GPN as formulas with propositional variables, which are the concepts of the GPN, can be represented in the form of a certain decision tree. Each formula of propositional variables and logical operations that is formed when interacting with the LLM of the interactive knowledge base is determined by the hierarchy of the classification structure of the subject of interaction. Depending on the attributes of the concepts of the active LLM, we obtain the value of membership of the propositional variable in certain classes of concepts, and thus form a formal record in the notation of the algebra of statements and then in the form of a GPN.
The atypical nature of expressions (1) - (4), including the case where the contexts of SSFLs are defined using propositional variables, means that the type of the values of these contexts isn’t important for calculations. They can be both numerical and non-numerical. Moreover, logical expressions made of propositional variables are quite stable with respect to the order of their positioning in the formal expression, so they can occupy an arbitrary position in the record. Also, the values that they receive in the calculation don’t require determining a strict or non-strict order relation. That is, transformations (2) - (4) are always able to determine the truth and objectivity of LLM values [12].
Thus, the GPN is the primary LLM taxonomy of the narrative of the document being processed. The training sample, which is the primary basis of the machine learning process of the interactive knowledge base, is formed from the concepts of this narrative. The GPN formed on this basis then provides a systematic reflection of all the narratives that make up the primary information base of the interactive knowledge system. The systemology of the interactive knowledge base follows from the systemology of the LLM and the GPN. This provides a complete and correct interpretation of the properties of all the concepts that make it up. As a consequence, it solves the problems of classifying the concepts that determine the latest nodes of the GPN and of diagnosing the states of all concepts on the basis of forming logical formulas in the notation of the algebra of statements. Also, the systemology and dichotomy of propositional expressions made of the concepts of the GPN create conditions for predicting the presence of certain properties in the newly formed nodes of the GPN.
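The dichotomous “YES”/“NO” evaluation of propositional expressions over GPN concepts, and their stability with respect to the order of operands, can be illustrated with a minimal sketch. The tuple encoding of formulas, the function names `evaluate` and `answer`, and the sample concepts are assumptions made for illustration; they are not taken from the paper.

```python
def evaluate(formula, assignment):
    """Evaluate a propositional formula given truth values of concepts.

    A formula is either a concept name (str) or a tuple whose first
    element is an operator: "not", "and", "or", or "implies".
    """
    if isinstance(formula, str):
        return assignment[formula]
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], assignment)
    if op == "and":
        return all(evaluate(a, assignment) for a in args)
    if op == "or":
        return any(evaluate(a, assignment) for a in args)
    if op == "implies":
        premise, conclusion = args
        return (not evaluate(premise, assignment)) or evaluate(conclusion, assignment)
    raise ValueError(f"unknown operator: {op!r}")


def answer(formula, assignment):
    """Map the truth value to the dichotomous "YES"/"NO" answer."""
    return "YES" if evaluate(formula, assignment) else "NO"


# Hypothetical concepts acting as propositional variables.
assignment = {"tree": True, "conifer": False, "plant": True}

f1 = ("and", "tree", ("not", "conifer"), ("implies", "tree", "plant"))
# The same operands in a different order.
f2 = ("and", ("implies", "tree", "plant"), ("not", "conifer"), "tree")

print(answer(f1, assignment))                          # YES
print(answer(f2, assignment))                          # YES
print(answer(("and", "tree", "conifer"), assignment))  # NO
```

Reordering the operands of the conjunction leaves the answer unchanged, which is the order-independence the text attributes to expressions made of propositional variables.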
Prediction in our view of the LLM can have a truncated form of expression (2), which is supplemented by a representation of the form (6), namely:
$$X_1, X_2, \ldots, X_n, a_1, a_2, \ldots, a_m$$
$$X_1, X_2, \ldots, X_n, a_1, a_2, \ldots, a_m, \qquad (5)$$
(6)
where the contexts for all SSFL-concepts are defined. In this case, the set of λ-terms includes certain functional expressions that implement predictive calculations [12, 16].
The decision tree, which is based on the relationships between the concepts of the GPN, is a composition of Böhm trees and can be converted into a propositional expression. Its elementary expressions, within the conditions of the specific problem, take the value “true” or “false”. These values are calculated by determining the degree to which the attributes of the new concepts belong to the characteristic descriptions that make up the contexts of the training sample.
Expressions (5) - (6) define not only different functionalities, but also the systemic stability of the latest concepts of the GPN. To do this, a procedure for discretizing the set of λ-terms is determined, which defines the corresponding numerical scales, consisting of intervals characteristic of the context values of SSFL-concepts in a particular state. These procedures also take into account the frequency distribution of concepts in different classes, thereby strengthening their classification features in the GPN and, as a consequence, systemic accuracy. Another consequence is the formation of more effective propositional expressions using the latest concepts of the GPN, which are unique to the decision tree and as a result define more stable systemic rules.
$$BT(M) \quad U\, x_1, x_2, \ldots, x_n,$$
$$a_1, a_2, \ldots, a_m \qquad (7)$$
where $BT(M)$, according to [4], is the labeled tree and $M$ is a term that has solvability, that is, all statements formed from its SSFL-concepts are true.
Thus, the interactive system of knowledge implemented on the basis of forming a GPN while processing documents and narratives is characterized by the high stability of the systemic features of the GPN concepts and their relations. This is ensured by the following procedural interpretation of the properties of the GPNs themselves, as objects of a complex hierarchical structure.
1) Formation of propositional expressions in the notation of the algebra of statements that determine the classes of GPN concepts, based on the optimal definition and selection of attribute combinations that are significant in the interval of a certain scale. At the same time, by applying the “negation” operation, the procedure of minimizing the description length of each class defined in the GPN is also implemented.
2) Reliable classification of all concepts included in the training sample for the GPN and, as a consequence, the formation of propositional expressions that dynamically reveal the patterns of both the relevant classes of concepts and the relationships between them, while regulating the compactness of the training sample, excluding quality assessment of the discovered patterns.
3) Defining the membership function, which implements the mechanisms of fuzzy logic in calculating the characteristic features of GPN concepts and their classes, and obtaining crisp and fuzzy levels of reliability and their ranks, the validity of attribute features of concepts and of their properties and relationships, including the null value of the “I don’t know” type.
All these procedural actions ensure the formation of the GPN and, on its basis, the LLM, which determines the functional structure of the interactive knowledge system. Based on them, the linguistic-semantic and conceptual analysis and processing of multilingual natural-language narrative descriptions are realized in the environment of the specified system. They provide the selection of linguistic constructs of different length and complexity and the identification and selection of intercontextual relations for all concepts that determine the semantic features of the GPN and LLM, including the training sample.
The GPN and, as a consequence, the LLM, built on the basis of the machine learning procedures described above, are characterized by the property of inductiveness. The further development of the GPN, based on the encapsulation of new concepts, also expands the set of propositional expressions, which are in fact certain linguistic constructs built by applying logical operations to unordered elementary records, that is,
statements that don’t have logical operators inside.
This is functionally represented by expressions (2) - (7). When forming Böhm trees of the form (4) under the condition that the contexts of their nodes determine only true values, we implement the recursion from expressions (2) - (4).
The identification of intercontextual relations in the process of encapsulating the latest concepts and the further inductive growth of the pyramidal network realizes the discovery of new statements as systems of knowledge. The intercontextual relations are revealed through the logical operation “conjunction”, and the direct growth of the GPN is realized by the logical operations “disjunction”, “negation” and “following”, both direct and reverse.
If we apply Gödel’s incompleteness theorem [4], we can conclude that no matter how many concepts are encapsulated in the GPN and LLM, and no matter how many of their contexts in the GPN are related, the GPN, the LLM and the interactive knowledge system will never be complete. The result is the formation of indeterminate nodes, which are the result of applying the “conjunction” operation to selected sets of the training sample.
All undefined nodes are concepts of complex structure. Such concepts, like linguistic constructs, have logical operations inside them and can therefore take the form of complex statements. Such concepts can then also be presented as propositional expressions that are able to define and classify the latest concepts with a complex structure.
Also, uncertainty concepts, using the inductiveness property, implement the clustering procedure, which identifies semantically equivalent concepts and their classes. The degree of this equivalence is determined by applying the membership function. Depending on the significance of the degree of equivalence, the uncertainty concepts either form a new class or are included in an existing thematic class.
Finally, the measure of equivalence allows us to apply the rule of logical inference “following” by analogy. With a predetermined degree of equivalence, it’s possible to draw conclusions about whether new concepts and their classes belong to those already defined, and also to determine the degree of validity of certain statements that are formed on the basis of concepts whose contexts are relevant.
3. Conclusions
The methodology and formation of growing pyramidal networks constructively ensure the transformation of narrative texts into the format of interactive knowledge bases. GPNs make it possible to determine the conditions for the stability of the information bases of interactive knowledge systems and to transform into their formats unstructured narrative descriptions of various types, from scientific articles to catalogs of scientific and technical products, monographs and more.
The conceptual basis of such transformations, in the form of atypical expressions, supports the implementation of intelligent services for processing narratives by means of linguistic-semantic and conceptual analysis, with their subsequent transformation into the format of logical-linguistic models and interactive knowledge bases.
4. References
[1] V. Voevodsky, Univalent Foundations of Mathematics: Proceedings of Logic, Language, Information and Computation, WoLLIC 2011, in: Lev D. Beklemishev, Ruy de Queiroz (Eds.), Lecture Notes in Computer Science, volume 6642, Springer, Berlin, Heidelberg, 2011, p. 311. doi:10.1007/978-3-642-20920-8.
[2] Homotopy Type Theory: Univalent Foundations of Mathematics, Institute for Advanced Study, Princeton, 2013, 603 p.
[3] D. van Dalen, Logic and Structure, Universitext, Springer, Berlin, 2013. doi:10.1007/978-1-4471-4558-5.
[4] P. Dybjer, D. Kuperberg, Formal neighbourhoods, combinatory Böhm trees, and untyped normalization by evaluation, Ann. Pure Appl. Logic, volume 163, 2012, pp. 122-131. doi:10.1016/j.apal.2011.06.021.
[5] V. Gladun, Processes of formation of new knowledge, Sophia, SD "Teacher 6", 1994.
[6] O. Strizhak, Transdisciplinary integration of information resources (Information Technology), PhD thesis, The National Academy of Sciences of Ukraine, Inst. of Telecommunications and Global. inform. Space, Kyiv, 2014.
[7] M. Dymarsky, Ways to embody the predicative relationship: Acta Linguistica Petropolitana, in: Proceedings of the Institute of Linguistic Research of the Russian Academy of Sciences, in: N. N. Kazansky (Ed.), T. XI. Part 1, Categories of nouns and verbs in the system of functional grammar, in: M. D. Voeikova, E. G. Sosnovtseva (Eds.), Nauka, 2015, pp. 41-62.
[8] P. Stalmaszczyk (Ed.), Understanding Predication, Studies in Philosophy of Language and Linguistics, Peter Lang, Frankfurt am Main, Bern, Bruxelles, New York, Oxford, Warszawa, Wien, 2017, 292 p. doi:10.3726/b11243.
[9] O. Kulbabska, Modern interpretations of the category of predication in linguistics, Ukrainian language, № 1, 2009, pp. 61-73. ISSN 1682-3540.
[10] K. B. Ivanova, K. Vanhoof, K. Markov, V. Velychko, Introduction to the Natural Language Addressing, International Journal "Information Technologies & Knowledge", volume 7, number 2, 2013, pp. 139-146.
[11] N. B. Cocchiarella, Philosophical Perspectives on Formal Theories of Predication, in: Handbook of Philosophical Logic, Synthese Library (Studies in Epistemology, Logic, Methodology, and Philosophy of Science), volume 167, Springer, Dordrecht, 1989, pp. 253-326. doi:10.1007/978-94-009-1171-0_3.
[12] V. Velichko, Logical-linguistic models as a technological basis of interactive knowledge bases, International Journal "Information Models and Analyses", volume 8, number 4, 2019, pp. 325-340.
[13] O. Stryzhak, S. Dovgyi, M. Popova, R. Chepkov, Transdisciplinary Principles of Narrative Discourse as a Basis for the Use of Big Data Communicative Properties, in: K. Arai (Ed.), Advances in Information and Communication, FICC 2021, Advances in Intelligent Systems and Computing, volume 1364, Springer, Cham, 2021. doi:10.1007/978-3-030-73103-8_17.
[14] O. Stryzhak et al., Decision-making System Based on The Ontology of The Choice Problem, J. Phys.: Conf. Ser. 1828 012007, 2021. doi:10.1088/1742-6596/1828/1/012007.
[15] S. Dovgyi, O. Stryzhak, Transdisciplinary Fundamentals of Information-Analytical Activity, in: M. Ilchenko, L. Uryvsky, L. Globa (Eds.), Advances in Information and Communication Technology and Systems, MCT 2019, Lecture Notes in Networks and Systems, volume 152, Springer, Cham, 2021. doi:10.1007/978-3-030-58359-0_7.
[16] A. Gonchar, O. Strizhak, L. Berkman, Transdisciplinary consolidation of information environments, Communication, № 1 (149), 2021, pp. 3–9.