<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Correlation Analysis of Text Author Identification Results Based on N-Grams Frequency Distribution in Ukrainian Scientific and Technical Articles</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Victoria</forename><surname>Vysotska</surname></persName>
							<email>victoria.a.vysotska@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>S. Bandera Street, 12</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Osnabrück University</orgName>
								<address>
									<addrLine>Friedrich-Janssen-Str. 1</addrLine>
									<postCode>49076</postCode>
									<settlement>Osnabrück</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Oksana</forename><surname>Markiv</surname></persName>
							<email>oksana.o.markiv@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>S. Bandera Street, 12</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sofiia</forename><surname>Teslia</surname></persName>
							<email>sofiia.teslia.sa.2019@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>S. Bandera Street, 12</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yeva</forename><surname>Romanova</surname></persName>
							<email>yeva.romanova.sa.2019@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>S. Bandera Street, 12</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Inesa</forename><surname>Pihulechko</surname></persName>
							<email>inesa.pihulechko.sa.2019@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>S. Bandera Street, 12</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="department">International Conference on Computational Linguistics and Intelligent Systems</orgName>
								<address>
									<addrLine>May 12-13</addrLine>
									<postCode>2022</postCode>
									<settlement>Gliwice</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Correlation Analysis of Text Author Identification Results Based on N-Grams Frequency Distribution in Ukrainian Scientific and Technical Articles</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">5ED4776A8CD8C2604D5B7B4AE5FF6CEC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T12:55+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>N-Grams</term>
					<term>NLP</term>
					<term>correlation analysis</term>
					<term>authorship definition</term>
					<term>Ukrainian text</term>
					<term>distribution function density</term>
					<term>exponential and median smoothing</term>
					<term>linguometry</term>
					<term>stylometric analysis</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper reports the experimental validation of a proposed content monitoring method for determining author style in Ukrainian scientific texts of a technical profile. Authorship identification systems typically rely on plagiarism and rewrite metrics, which establish whether a work has been borrowed fully or partially. Consequently, they do not cover the case in which the work has not yet been published. Quantitative content analysis of scientific and technical texts draws on content monitoring and text analysis based on NLP, Web-Mining and stylometry methods to identify the set of authors whose speech styles are similar to the studied passages. This narrows the search before stylometric methods are applied to determine the degree to which the analyzed text belongs to a particular author. The method of determining the author is decomposed into the analysis of speech coefficients such as lexical diversity, degree (measure) of syntactic complexity, speech coherence, and indices of text exclusivity and concentration. In parallel, parameters of the author style are analyzed, such as the number of words, sentences, prepositions and conjunctions in the text, and the number of words with a frequency of 1, or of 10 or more.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>As text documents in electronic form become increasingly available and widely distributed, automatic methods for analyzing document content grow in importance <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref>. The tasks of text analysis include the classification and clustering of documents <ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref> by various criteria, such as genre, writing format (novel, essay), emotional coloring and speech style, as well as the task of text author identification <ref type="bibr">[8-14]</ref>.</p><p>As access to data becomes simpler and the ability to search, copy and distribute data over networks grows, the task of identifying the author becomes urgent. Issues related to the determination of authorship are also important in linguistic, historical and forensic research. The general availability of electronic devices makes it possible to move author recognition by large panels of experts into the background and to speed up and simplify the process through automation.</p><p>Author identification is defined as the process of establishing the author of a text based on the set of general and particular features of the text that constitute the author style <ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>Statistical methods based on the search for an "author invariant" are popular in existing systems for determining text authorship. The "author invariant" characterizes the linguistic features of a text (lexical, grammatical, phraseological and others). The invariant can be, for example, the share of vowels or consonants, the frequency of use of a certain part of speech, the probability of transitions from one part of speech to another, "favorite" words, or information entropy. A statistical method for determining the author and genre of a text based on the frequency distribution of letter combinations (n-grams) has been proposed <ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref>. This method has shown decent results for Slavic research publications. Unfortunately, the accuracy of statistical methods for determining authorship depends on the specifics of the data: the language, style and length of the texts studied <ref type="bibr" target="#b12">[13]</ref><ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref><ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref><ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>. Because of this, it is difficult to judge the accuracy of such an approach on data of a different nature. 
For this reason, the aim of this work is to analyze how the distribution of letter combinations can be applied, for different languages, to the problem of establishing the authorship of texts of different lengths written in different language styles. The chosen topic, the relative frequency of n-grams, is only beginning to gain attention in Ukraine and is not yet widely studied. Several literature sources describing what n-grams are and what they are used for have been found.</p><p>An N-gram is a sequence of n elements <ref type="bibr" target="#b28">[29]</ref>. From a semantic point of view, it can be a sequence of sounds, syllables, words or letters. In practice, N-grams most often appear as series of words, that is, stable phrases called collocations. A sequence of two consecutive elements is often called a bigram and a sequence of three elements a trigram; both are present in the studied dataset. Sequences of four or more elements are denoted as N-grams, where N is replaced by the number of consecutive elements. N-grams are used in a wide range of sciences, for example in theoretical mathematics, biology, cartography and music. Common uses of N-grams include extracting data for clustering series of satellite images of the Earth and deciding which specific parts of the Earth appear in an image, searching for genetic sequences, data compression, and indexing data in search engines, where N-grams are typically used to index sound-related data. In natural language processing, N-grams are used mainly for prediction based on probabilistic models. The N-gram model calculates the probability of the last word of an N-gram given that all the previous words are known. When this approach is used to model language, it is assumed that the appearance of each word depends only on the previous words <ref type="bibr" target="#b29">[30]</ref>. 
Another application of N-grams is the detection of plagiarism. If a text is divided into several small fragments represented by N-grams, the fragments are easy to compare with each other, which yields a degree of similarity between the documents being checked <ref type="bibr" target="#b30">[31]</ref>. N-grams are often used successfully to categorize text and language. In addition, they can be used to create features that allow knowledge to be gained from textual data. N-grams also make it possible to find candidate replacements for misspelled words efficiently. Google research centers have used N-gram models for a wide range of research and development, including projects such as statistical translation from one language to another, language recognition, spelling correction and information retrieval. These projects used text corpora containing several trillion words. Google decided to create its own training corpus. The project is called Google tera corpus and it contains 1,024,908,267,229 words collected from public websites <ref type="bibr" target="#b31">[32]</ref>.</p><p>The decryption of cryptograms has long been aided by frequency analysis, the essence of which is the study of statistical patterns in the appearance of symbols and their combinations in original and encrypted messages <ref type="bibr" target="#b0">[1]</ref>. To complicate frequency analysis, ciphers that lead to a uniform distribution of characters in the cryptogram have appeared in cryptography. 
The principles of frequency analysis are widely used in password recovery programs, where they reduce the search time by several orders of magnitude <ref type="bibr" target="#b1">[2]</ref>, and in the classification and clustering <ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref> of documents by various criteria, such as genre, epoch, format (novel, essay), emotional coloring and speech style, as well as in the task of determining the text author <ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref><ref type="bibr" target="#b12">[13]</ref><ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref>. Frequency analysis requires, first of all, reference frequencies of the letters of the alphabet in which the plaintexts are written, together with reference frequencies of N-grams. For Ukrainian, English and almost all European languages the average frequencies of letters, bigrams and trigrams can be found in the literature <ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref><ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref>.</p><p>Unfortunately, for the Ukrainian language only the frequencies of letters are given in the literature <ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref>. Therefore, the purpose of this work is to investigate the frequencies of letters and letter combinations in the Ukrainian language on the basis of randomly selected Ukrainian texts of scientific and technical orientation. 
The analysis of the obtained data confirms that the alternation of vowels and consonants is inherent in the Ukrainian language, as it is in other European languages. If other texts are studied, the frequencies of individual letters may differ, which is explained, firstly, by the length of the studied text and, secondly, by its subject matter. For example, the letter F, which is generally infrequent, can become quite common in technical texts, because it is used in such words as function, differential, diffusion, coefficient, etc. Even greater deviations from the usual frequencies of individual letters are observed in some literary works, especially in poems.</p></div>
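<div xmlns="http://www.tei-c.org/ns/1.0"><p>The N-gram language model described in this section can be illustrated with a minimal sketch. It estimates the probability of a word given the previous word (a bigram model) from relative counts. The function name bigram_probabilities and the toy token list are assumptions made for illustration only; they are not taken from the cited sources.</p><p>
```python
from collections import Counter


def bigram_probabilities(tokens):
    """Maximum-likelihood estimate of P(next | previous) from a token list."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count each token only where it starts a bigram (the final token starts none).
    prefix_counts = Counter(tokens[:-1])
    return {
        (prev, nxt): count / prefix_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }


# Toy corpus, invented for illustration only.
tokens = "the model counts the words and the model predicts".split()
probs = bigram_probabilities(tokens)

# Probability that "model" follows "the" in the toy corpus.
print(probs[("the", "model")])
```
</p><p>The same counting scheme extends to longer N-grams by conditioning on the previous n-1 tokens instead of one; practical models usually add smoothing for unseen combinations, which this sketch omits.</p></div>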
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methods and Materials</head><p>Modern systems for determining text authorship use different approaches drawn from mathematical statistics, pattern recognition and probability theory, cluster analysis algorithms, neural networks and others <ref type="bibr" target="#b32">[33]</ref><ref type="bibr" target="#b33">[34]</ref><ref type="bibr" target="#b34">[35]</ref><ref type="bibr" target="#b35">[36]</ref><ref type="bibr" target="#b36">[37]</ref><ref type="bibr" target="#b37">[38]</ref><ref type="bibr" target="#b38">[39]</ref><ref type="bibr" target="#b39">[40]</ref><ref type="bibr" target="#b40">[41]</ref><ref type="bibr" target="#b41">[42]</ref><ref type="bibr" target="#b42">[43]</ref><ref type="bibr" target="#b43">[44]</ref><ref type="bibr" target="#b44">[45]</ref>. The systems differ in the method of author identification, the means of text analysis, the amount of text required and the accuracy achieved <ref type="bibr" target="#b11">[12]</ref>. 
Methods of text authorship identification that are based on counting text characteristics (function words such as prepositions, conjunctions and particles; content words such as nouns, verbs and adjectives; word lengths; sentence lengths) also differ in how they compare frequencies across different textual content for different tasks <ref type="bibr" target="#b45">[46]</ref><ref type="bibr" target="#b46">[47]</ref><ref type="bibr" target="#b47">[48]</ref><ref type="bibr" target="#b48">[49]</ref><ref type="bibr" target="#b49">[50]</ref><ref type="bibr" target="#b50">[51]</ref><ref type="bibr" target="#b51">[52]</ref><ref type="bibr" target="#b52">[53]</ref><ref type="bibr" target="#b53">[54]</ref><ref type="bibr" target="#b54">[55]</ref><ref type="bibr" target="#b55">[56]</ref><ref type="bibr" target="#b56">[57]</ref><ref type="bibr" target="#b57">[58]</ref><ref type="bibr" target="#b58">[59]</ref><ref type="bibr" target="#b59">[60]</ref><ref type="bibr" target="#b60">[61]</ref><ref type="bibr" target="#b61">[62]</ref><ref type="bibr" target="#b62">[63]</ref>. The most commonly used measures for comparing texts are information entropy, Fisher information, the chi-squared test and the Kullback-Leibler divergence <ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref>.</p><p>When identifying the author of a text, it is assumed that the text reflects the individual style of its author, which makes it possible to distinguish it from the texts of others. To compare texts with each other, each text must be associated with some numerical characteristic that is close for texts of the same author and differs between the works of different authors. 
As such a characteristic, the articles <ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref> use the distribution function density (DFD) of letter combinations of three consecutive characters (3-grams). The DFD is defined as the set of empirical frequencies of occurrence of letters or their combinations. The analysis of a text with the help of the DFD does not take into account punctuation marks, spaces and numbers.</p><p>The task of identifying the author of an unknown text in terms of the DFD is formulated as follows.</p><p>Consider a set of texts containing works of known authors. Let $L_t$ be the number of works by the $t$-th author and $N_{i,t}$ the number of symbols in the $i$-th work of the $t$-th author, $i = 1, \dots, L_t$. All texts in this set are represented by their DFD. The DFD of a text of volume $N_{i,t}$ is given as the set of values $p_{i,t}(j) = l_j / N_{i,t}$, where $l_j$ is the number of occurrences of the N-gram with index $j$. The argument $j = 1, \dots, f(n, M)$ is the index of the letter combination (n-gram) in alphabetical order, where $M$ is the size of the alphabet of the language in which the text is written and $n$ is the order of the N-grams, i.e. the number of characters in a letter combination; $f(n, M) = M^n$ is the number of N-grams over this alphabet.</p><p>Each author is identified with his weighted average DFD, which is given by formula (1):</p><formula xml:id="formula_0">P_t = \frac{1}{N_t} \sum_{i=1}^{L_t} p_{i,t} N_{i,t}, \qquad N_t = \sum_{i=1}^{L_t} N_{i,t}<label>(1)</label></formula><p>These DFD play the role of author standards <ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref>. To compare two texts, or a text and an author standard, a distance between the corresponding distribution functions must be specified. 
The norm in the space of summable functions (the $L_1$ norm) is used as the distance metric. For example, the distance $w_{0,t}$ between the DFD of the unknown text $p_0$ and any author standard $P_t$ is calculated by formula (2):</p><formula xml:id="formula_1">w_{0,t} = \| p_0 - P_t \| = \sum_{j=1}^{f(n,M)} \left| p_0(j) - P_t(j) \right|<label>(2)</label></formula><p>Accordingly, the text «0» is attributed to the author whose standard DFD lies at the shortest distance from it.</p><p>When solving the classification problem, the data set was not explicitly divided into test and training sets. Weighted average DFD were built on the whole set of books by one author. The distance from book $i$ to "its own" author $t$ is calculated by formula (3):</p><formula xml:id="formula_2">w_{i,t} = \frac{\| p_{i,t} - P_t \|}{1 - N_{i,t} / N_t}<label>(3)</label></formula><p>Formula (3) excludes the contribution of the DFD of the $i$-th document (article) to the average DFD of "its" author <ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref>. Smoothing 3-gram distribution functions analytically is not feasible because the function is too complex. Only an algorithmic approach remains, for which the main methods are the simple (ordinary) moving average, the weighted moving average, exponential smoothing and median smoothing. In our case, we believe the moving average method, also known as the filtering method, to be the most suitable. Its application reduces the variability of the data, which fits our chosen tactic of ignoring extreme values (highs and lows). 
The degree of smoothing should be tied in advance to a criterion that ensures maximum smoothing while still retaining information.</p><p>In our specific case, we believe that correlation analysis of the 3-gram data sequences of any two of the three selected articles can help to determine the relationship between them, and thus to answer the question of how similar the topics of the articles are. To do this, the sequence of values for the first studied article is denoted by the variable x and that for the second article by the variable y, and a correlation analysis of the pair of sequences XY is performed. The task of correlation is not only to assess the relationship, but also to reduce the target assessment to a numerical expression. Studying 3-gram sequences makes it possible to reduce significantly the number of variables that are treated as important. Combining groups that are close according to the metrics forms a new cluster, whose closeness metrics are then compared with the others, so that a fairly clear structure of the data set can eventually be obtained. The quantitative method of identifying the potential author of a text from a set of possible authors, based on the results of comparing the reference text with the studied one, relies on the technique of linguometry.</p><p>Linguometry is a branch of applied linguistics that detects, measures and analyzes the quantitative characteristics of units of language or speech at different levels <ref type="bibr" target="#b32">[33]</ref>. Using the apparatus of mathematical statistics, linguometry is involved in solving such problems of linguistics as the following:</p><p>• dictionaries (including frequency and statistical) and comparisons; • automatic dictionaries, thesauri; • shorthand systems; • methods and means of automatic language detection; • methods and means of information retrieval, etc.</p><p>
Each language has its own statistical parameters, and knowledge of the frequency of occurrence of letters and their combinations (2-grams, 3-grams, 4-grams) makes it possible to identify the language automatically. For example, for Ukrainian texts <ref type="bibr" target="#b33">[34]</ref><ref type="bibr" target="#b34">[35]</ref><ref type="bibr" target="#b35">[36]</ref><ref type="bibr" target="#b36">[37]</ref> it was found that the statistical parameters of styles can include the frequencies of vowels, consonants and spaces between words, as well as of the soft and sonorous groups of consonants <ref type="bibr" target="#b32">[33]</ref>. We show how to evaluate the speech of a particular author on a particular passage of his work <ref type="bibr" target="#b35">[36]</ref> using a certain standard, for example, the letter frequencies of the Ukrainian language. Consider two passages of technical text in Ukrainian, presented in a format where the letters are arranged in descending order of frequency of appearance (the frequencies are given in Table <ref type="table" target="#tab_0">1</ref>); no distinction is made between lowercase and uppercase letters. The correlation between the letter frequencies of the passages <ref type="bibr" target="#b34">[35]</ref> and of the standard <ref type="bibr" target="#b35">[36]</ref> has been investigated. The results that confirm the conclusions are also presented graphically. For convenience, Table 1 lists the following data: the frequencies of the letters of the Ukrainian language, and the absolute and relative frequencies of the letters used in the studied Passage 1 (Article 1) <ref type="bibr" target="#b34">[35]</ref> and Passage 2 (Article 2) <ref type="bibr" target="#b35">[36]</ref>. Passage 1 contains 556 characters; Passage 2 contains 541 characters. The entry "other" in the column of letters groups letters specific to the Ukrainian language (ї, є, г, і), which are rarely used in most technical texts. 
This provides some independence in the analysis. Fig. <ref type="figure" target="#fig_6">1</ref> illustrates the obtained results graphically.</p></div>
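<div xmlns="http://www.tei-c.org/ns/1.0"><p>Formulas (1)-(3) of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names dfd, author_standard, l1_distance and leave_one_out_distance and the sample texts are invented here, non-letter characters are removed before 3-grams are formed (so a 3-gram may span a word boundary), and only the N-grams that actually occur are stored instead of all M^n of them.</p><p>
```python
from collections import Counter


def dfd(text, n=3):
    """Relative frequencies p(j) = l_j / N of letter n-grams in one text."""
    # Assumption: punctuation, spaces and digits are dropped before counting.
    letters = [ch.lower() for ch in text if ch.isalpha()]
    grams = Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))
    size = len(letters)  # N_{i,t}: number of symbols in the text
    return {gram: count / size for gram, count in grams.items()}, size


def author_standard(texts, n=3):
    """Formula (1): size-weighted average DFD over all texts of one author."""
    parts = [dfd(text, n) for text in texts]
    total = sum(size for _, size in parts)  # N_t
    standard = Counter()
    for p, size in parts:
        for gram, freq in p.items():
            standard[gram] += freq * size / total
    return dict(standard), total


def l1_distance(p, q):
    """Formula (2): sum of absolute differences over all n-grams."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


def leave_one_out_distance(text, texts_of_author, n=3):
    """Formula (3): distance from a text to its own author's standard,
    rescaled so that the text itself does not contribute to the standard.
    Requires `text` to be one of at least two texts in `texts_of_author`."""
    p, size = dfd(text, n)
    standard, total = author_standard(texts_of_author, n)
    return l1_distance(p, standard) / (1 - size / total)


# Invented sample data for two hypothetical authors.
texts_a = ["функція розподілу частот", "частотний розподіл триграм"]
texts_b = ["кореляційний аналіз текстів", "аналіз авторського стилю"]
standards = {"A": author_standard(texts_a)[0], "B": author_standard(texts_b)[0]}

p0, _ = dfd("розподіл частот триграм")
nearest = min(standards, key=lambda name: l1_distance(p0, standards[name]))
print(nearest)
```
</p><p>Dividing by 1 - N_{i,t}/N_t in formula (3) gives the same value as recomputing the author standard without the i-th text, which avoids rebuilding the standard for every held-out work.</p></div>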
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 1: The relative frequencies of letters in the standard and the studied passages</head><p>The graphical representation of the relative frequencies of letters in the passages gives a convincing answer to the question of which passage was written by which author.</p><p>The distribution of 1-grams differs between the works. The most informative indicator for studying texts is the analysis of 3-grams <ref type="bibr" target="#b37">[38]</ref><ref type="bibr" target="#b38">[39]</ref><ref type="bibr" target="#b39">[40]</ref><ref type="bibr" target="#b40">[41]</ref><ref type="bibr" target="#b41">[42]</ref><ref type="bibr" target="#b42">[43]</ref><ref type="bibr" target="#b43">[44]</ref>, which we check in the next stages of the study. There is a sharp jump in the relative frequency of the letter "e" in Passage 2 relative to the reference values of Standard 1 <ref type="bibr" target="#b35">[36]</ref> (Fig. <ref type="figure" target="#fig_26">2</ref>), so we assume that Standard 1 was more likely written by the author of Passage 1 <ref type="bibr" target="#b34">[35]</ref>. We also give numerical values of the correlation between the letter frequencies in the passages and in the standard. We compute two correlation coefficients, one for the standard and Passage 1 <ref type="bibr" target="#b34">[35]</ref> and one for the standard and Passage 2 <ref type="bibr" target="#b36">[37]</ref>; the coefficient closer to 1 indicates the passage that is more likely to belong to the author of the standard. The correlation coefficient for the standard and Passage 1 is Re-У1 = 0.962716, and for the standard and Passage 2 it is Re-У2 = 0.909958. Similarly, the relative frequencies in Standard 2 and in Passages 1 and 2 in Fig. <ref type="figure">3</ref> differ significantly, so it is likely that the author of Standard 2 <ref type="bibr" target="#b33">[34]</ref> is not the author of Passages 1 and 2. 
The obtained values of the coefficients, together with the analysis of the graphical results, indicate that the probability that Passage 1 <ref type="bibr" target="#b34">[35]</ref> belongs to the author of Standard 1 <ref type="bibr" target="#b35">[36]</ref> is higher than for Passage 2 <ref type="bibr" target="#b33">[34]</ref>.</p><p>To achieve the research goal, a system that allows the language or languages of the analyzed content to be selected has been developed and implemented on the Victana web resource <ref type="bibr" target="#b62">[63]</ref>. For high-quality and effective content analysis when determining the degree to which a text belongs to a particular author, we propose to analyze the reference text and the studied text in several stages:</p><p>• Linguometric analysis of the coefficients of diversity of the author's speech (Fig. <ref type="figure" target="#fig_8">4</ref>, Alg. 1); • Stylometric analysis (Fig. <ref type="figure" target="#fig_9">5</ref>); • Analysis of stable phrases (Fig. <ref type="figure" target="#fig_10">6</ref>); • Linguistic and statistical analysis through N-grams (Fig. <ref type="figure" target="#fig_11">7</ref>).</p><p>The web resource for linguometric analysis has the following fields (Fig. <ref type="figure" target="#fig_8">4</ref>):</p><p>• Content is the field into which the studied text is pasted from the clipboard; • Signs (the entered text must contain at least 100 and at most 10000 characters) states the limits set on the content size; • Calculation starts the calculation; • Clearance clears the entered data.</p><p>Algorithm 1. Linguometric analysis of the text to determine authorship.</p><p>Step 1. Check the length of the text; any excess is cut off.</p><p>Step 2. Determine the number of sentences.</p><p>Step 3. Clean the studied text of numbers and special symbols.</p><p>Step 4. Determine the total number of words in the text N.</p><p>Step 5. Determine the number of words W.</p><p>Step 6. Determine the number of prepositions Z.</p><p>Step 7. Determine the number of conjunctions S.</p><p>Step 8. Calculate the coefficients of author speech.</p><p>Step 9. Output the results to the end user (Table <ref type="table" target="#tab_1">2</ref>, Fig. <ref type="figure" target="#fig_6">1</ref>).</p><p>The web resource for stylometric analysis has the following fields (Fig. <ref type="figure" target="#fig_9">5</ref>):</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 4: The example of linguistic analysis application result</head><p>• Select Passage 1 (2, 3) opens access to the excerpts. Access to the next passage is available only after access to the previous one has been activated; access is opened sequentially from the smaller number to the larger one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>• Reference text is the field into which the reference text is pasted from the clipboard.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>• "The text you enter must be at least 100 characters long. (Now 0)": after the calculation is started, the actual number of characters in each passage is calculated and displayed separately.</p><p>• Passage 1 (2, 3) is the field into which the text of the corresponding excerpt is pasted from the clipboard.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>• Calculate starts the calculation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Clear is clear the entered data.</p><p>Algorithm 2. Stylometric analysis of the text to determine authorship.</p><p>Step. 1. Check the lengths of standard text and selected passages and reduce the length of the reference text to the minimum of the checked.</p><p>Step. 2. Clean the reference text from special characters, etc.</p><p>Step. 3. Determine of the words number in the text of the standard.</p><p>Step. 4. Determine the number of stop words (prepositions + conjunctions + particles) in the text of the standard (Fig. <ref type="figure" target="#fig_10">5-6</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 5: Example of data entry for stylometric analysis</head><p>Step 5. Limit the length of Passage 1 to no more than the minimum text length.</p><p>Step 6. Clean Passage 1 of special characters, etc.</p><p>Step 7. Determine the number of words W1 in Passage 1.</p><p>Step 8. Determine the number of stop words (prepositions + conjunctions + particles) in the text of Passage 1.</p><p>Step 9. Prepare separate arrays (passage and standard) for calculating the correlation coefficient (Fig. <ref type="figure" target="#fig_10">6</ref>).</p><p>Step 10. Call the function that calculates the correlation coefficient.</p><p>Step 11. Build an array for the graphical representation of the relative frequency of stop words in Passage 1 and in the standard.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 6: Example of stylometric analysis application results</head><p>Step 12. Call the function that builds the relative frequency distribution graph (Fig. <ref type="figure" target="#fig_10">6</ref>).</p><p>Step 13. Call the function that calculates the correlation coefficient for Passage 2 (3) for each of the service words.</p><p>Step 14. Take the words of the Swadesh list from the reference book and determine how many of them occur in the text of the passage.</p><p>Step 15. Form the words common to the Standard, Passages 1-3 and the Swadesh list.</p><p>Step 16. Display the results of the study on the screen.</p></div>
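<div xmlns="http://www.tei-c.org/ns/1.0"><p>The core of Algorithm 2 (counting stop words in the standard and in a passage, then correlating their relative frequencies) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of the Victana web resource: the function names, the cleaning rule, the toy sentences and the deliberately tiny stop-word list are hypothetical, and the Pearson coefficient is assumed to be the correlation measure.</p><p>
```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (prepositions, conjunctions,
# particles); a real analysis would use a full list for the chosen language.
STOP_WORDS = ["в", "на", "і", "та", "що", "не", "з", "до"]


def clean_words(text):
    """Lower-case the text and keep only runs of letters (drops digits, symbols)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def stop_word_frequencies(text, stop_words=STOP_WORDS):
    """Relative frequency of each stop word among all words of the text."""
    words = clean_words(text)
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero for empty input
    return [counts[w] / total for w in stop_words]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant sequence; 0.0 is a convention here
    return cov / math.sqrt(var_x * var_y)


# Toy sentences, invented only to exercise the functions.
standard = "я був в місті і в селі та на морі і в горах"
passage = "він жив в лісі і на полі та в степу що не з нами"
r = pearson(stop_word_frequencies(standard), stop_word_frequencies(passage))
print(round(r, 4))
```
</p><p>The two frequency vectors play the role of the separate arrays prepared in Step 9; truncating the texts to a common length (Steps 1 and 5) is omitted to keep the sketch short.</p></div>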
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiment</head><p>Author identification assumes that a text reflects the individual writing style of its author, which distinguishes that author from others. To compare texts with each other, each text must be mapped to numerical characteristics that are close for texts by the same author and differ significantly for works by different authors.</p><p>The web resource for the analysis of N-grams has the following fields (Fig. <ref type="figure" target="#fig_11">7</ref>):</p><p>• Number of grams: the number of characters in a gram. The default is 3; it can be changed to 1, 2, 3 or 4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Choice of the text language: the language of the text to be analyzed. The default is "Ukrainian".</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Text: the field into which the studied text is pasted from the clipboard.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Restriction of the text length in characters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Generation: starts generating the N-grams.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Clearance: clears the entered data.</p><p>Algorithm 3. Linguistic and statistical analysis of the N-grams of a text:</p><p>Step 1. Clean the studied text (numbers, special symbols).</p><p>Step 2. Calculate the number of words in the text.</p><p>Step 3. Convert all words of the text to lower case.</p><p>Step 4. Remove the spaces.</p><p>Step 5. Substitute the alphabet that corresponds to the selected language.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 7: Example of N-gram text analysis application</head><p>Step 6. Depending on the selected number of grams, start the corresponding function, which generates all possible grams and saves them in an array.</p><p>Step 7. Start the function that counts the number of occurrences of the grams.</p><p>Here we calculate the relative frequency of occurrence and store in the array the ordinal number of the gram, the gram itself, its number of occurrences and its relative frequency of occurrence.</p><p>Step 8. The next function prepares the array produced by the previous function for export to a CSV file. The file is stored on the server and can be downloaded to the computer of the user (researcher) via a link that becomes available once the form with the results of the study has been generated.</p><p>Step 9. Display the results of the study on the screen (only those grams that are found in the text).</p><p>Step 10. Provide access to the export file.</p><p>Step 11. Output the generalized results.</p><p>Three scientific and technical publications have been compared on the basis of linguistic and statistical analysis of 3-grams. Articles 1 and 2 were written by one team, and Article 3 by another author (Table <ref type="table" target="#tab_2">3</ref>). The language of the texts is Ukrainian (33 letters in the alphabet, 35937 possible 3-grams). When comparing the articles, only those 3-grams that occur at least once in all three articles have been taken into account. For this particular example, there are 2147 such 3-grams. That is, 78.4814% of the 3-grams have been analyzed for Article 1, 72.6332% for Article 2 and 84.1271% for Article 3. Accordingly, the difference in usage of the relevant 3-grams is R12 = 56.5254% between Articles 1 and 2, R23 = 69.4271% between Articles 2 and 3, and R13 = 62.9839% between Articles 1 and 3. 
These indicators show that the characteristics of Articles 1 and 2 are more similar than those of Articles 1 and 3 or Articles 2 and 3 (R23 &gt; R12 by 12.9017%, R23 &gt; R13 by 6.4432%, R13 &gt; R12 by 6.4585%, that is, R23 &gt; R13 &gt; R12). The smaller Rij is, the more likely it is that articles i and j were written by the same author. Hence Articles 1 and 2 are more likely to have been written by one author or team than Articles 2 and 3 or Articles 1 and 3. Next, we analyze the use of individual clusters of 3-grams in the articles and compare the results.</p><p>Fig. <ref type="figure" target="#fig_12">8</ref> presents the use in Articles 1-3 of 3-grams starting with the letter a (their share in Articles 1-3 ranges from 6.1125% to 6.7087%). The curves for Articles 1-2 (4.2322%) and Articles 1-3 (4.197%) most often coincide or approach each other (the average discrepancy is 0.02713% and 0.0269%, respectively). For Articles 2-3 (4.6322%) the curves do not always coincide and differ significantly in places (the average discrepancy is 0.02969%). If only such 3-grams are analyzed, all three articles appear more likely to have been written by one author. This is because this letter is one of the most frequently used in Ukrainian words.</p><p>Figure 8: The use of 3-grams starting with the letter a (Article 1: blue, Article 2: red, Article 3: green)</p><p>Fig. <ref type="figure">9</ref> presents the use in Articles 1-3 of 3-grams starting with the letter б (the letter b in English); their share in Articles 1-3 ranges from 0.48884% to 0.77738%. The curves for Articles 1-2 (0.594%) most often coincide or approach each other, in contrast to Articles 1-3 (0.7072%) and Articles 2-3 (1.1208%). 
However, the curves of Article 1 and Article 3 also often coincide (suggesting that the articles were most likely written by one author): the average discrepancy is 0.01809%, while it is 0.0261% for Articles 1-2 and 0.02866% for Articles 2-3. If only such 3-grams (which are less common) are analyzed, Articles 1-2 appear more likely to have been written by one author and Article 3 by another. This is because this letter is rare in Ukrainian words, and some authors use such words more often out of habit and/or because of the subject matter of their publications (this requires further research).</p><p>Figure 9: The use of 3-grams starting with the letter б (Article 1: blue, Article 2: red, Article 3: green)</p><p>According to Table <ref type="table" target="#tab_3">4</ref> and Fig. <ref type="figure" target="#fig_6">10</ref>-12, some letters of the Ukrainian language are used very often and others much less often. For the most frequently used letters, the frequencies of 3-grams with such initial letters have almost the same distribution (top values in the graph of Fig. <ref type="figure" target="#fig_26">12</ref>), which is not the case for the other letters. Therefore, to determine the degree to which a text belongs to an author, it is advisable to study only the trigrams whose initial letters are less common in texts of the particular language (for example, Fig. <ref type="figure" target="#fig_26">12</ref>). Thus, for 3-grams starting with the letter є (their share in Articles 1-3 ranges from 0.2517% to 0.707%), the curves for Articles 1-2 (0.2508%) most often coincide or approach each other, in contrast to Articles 1-3 (0.6077%) and Articles 2-3 (0.5443%). 
Moreover, the curves of Article 1 and Article 2 often coincide (the articles were most likely written by one author): the average discrepancy is 0.0114%, while for Articles 2-3 (0.02478%) and Articles 1-3 (0.02762%) it is more than twice as high.</p><p>Figure 12: The use of 3-grams starting with the letter є (Article 1: blue, Article 2: red, Article 3: green)</p><p>Table <ref type="table" target="#tab_3">4</ref> shows the frequencies of letters in the standard and the studied passages. Fig. <ref type="figure" target="#fig_16">14</ref> shows histograms of the relative frequency of N-grams in Articles 1-3. Low-frequency (noise) values are the most common and make up the bulk of the data; they can be ignored (Fig. <ref type="figure" target="#fig_9">15</ref>). All graphs of the 3-gram frequency distribution in the articles show a clearly noticeable division of 3-grams into rarely used (noise-like) values and widely used peak values. The three articles thus illustrate that, to reduce the amount of information analyzed, it is desirable to analyze the distribution function only above a certain frequency threshold, which still covers the main information content. To compare the distribution functions of the three studied articles, it is necessary to compare clearly expressed average values. After analyzing the most commonly used 3-grams, we conclude that they are determined by the stylistics or grammar of the Ukrainian language and are not relevant for determining the specific topic of the articles. The most used 3-grams and their frequencies are: in Article 1, ння [nnya] 0.008476, енн [enn] 0.007175, ого [oho] 0.005473; in Article 2, ння [nnya] 0.006448, ист [yst] 0.006356, ува [uva] 0.006233; in Article 3, ння [nnya] 0.008769, ого [oho] 0.007717, мет [met] 0.006314.</p></div>
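<div xmlns="http://www.tei-c.org/ns/1.0"><p>The N-gram counting of Algorithm 3 and the selection of grams shared by several texts can be sketched as follows. This is an illustration under assumptions rather than the implementation behind the web resource: the names ngram_table and common_grams are hypothetical, cleaning is reduced to keeping only letters of the alphabet, the sample phrases are invented, and the CSV export of Step 8 is left out.</p><p>
```python
from collections import Counter

# Ukrainian alphabet, 33 letters (so 33 ** 3 = 35937 possible 3-grams).
UKRAINIAN_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"


def ngram_table(text, n=3, alphabet=UKRAINIAN_ALPHABET):
    """Return rows (index, gram, count, relative frequency) for the grams found."""
    # Lower-case the text and keep only alphabet letters; this drops spaces,
    # digits and special symbols in one pass.
    letters = "".join(ch for ch in text.lower() if ch in alphabet)
    # Slide a window of n characters over the cleaned text.
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = len(grams) or 1  # avoid division by zero for very short input
    rows = []
    for index, gram in enumerate(sorted(counts), start=1):
        rows.append((index, gram, counts[gram], counts[gram] / total))
    return rows


def common_grams(*tables):
    """Grams that occur at least once in every table."""
    sets = [set(row[1] for row in table) for table in tables]
    return set.intersection(*sets)


table_1 = ngram_table("Аналіз тексту")
table_2 = ngram_table("аналіз даних")
print(len(UKRAINIAN_ALPHABET) ** 3)            # number of possible 3-grams
print(sorted(common_grams(table_1, table_2)))  # 3-grams shared by both texts
```
</p><p>Because spaces are removed before the window slides, grams may span word boundaries, which matches Step 4 of the algorithm. Restricting the comparison to the output of common_grams mirrors the rule that only 3-grams found in all compared articles are taken into account.</p></div>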
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>In the algorithmic approach, the trend is obtained by means of various algorithms that implement smoothing procedures in practice. These procedures give the researcher only an algorithm for calculating the new value of the time series at any given time t. The methods can be classified as follows: simple (ordinary) moving average (Fig. <ref type="figure" target="#fig_10">16</ref>), weighted moving average, exponential smoothing and median smoothing. In this part of the calculations, the relative frequency of use of 3-grams in the three texts has been smoothed by the moving average, exponential smoothing and median smoothing methods.</p><p>The moving average method is one of the oldest known methods of smoothing time series. It is based on replacing the initial values of the series with their averages over a time interval whose length is selected in advance. The selected interval slides along the series. Moving averages can smooth out both random and periodic fluctuations and reveal existing trends in the process, and therefore serve as an important tool for filtering time series components. The moving average method estimates the average level over a period of time. The longer the interval over which the average is taken, the smoother the level will be, but the less accurately the trend of the original time series will be described. In all figures, the gray graph shows the initial relative frequency and the red graph shows the smoothed relative frequency data.</p><p>At small values of the interval size w, the smoothing effect is not very pronounced, as can be seen in Figures <ref type="figure" target="#fig_12">16-18</ref> for Article 1 (the data are smoothed using the smoothing interval sizes w = 3, 5, 7, 9, 11, 13, 15). 
Next, the data are smoothed with the smoothing interval size w = 3 (Fig. <ref type="figure" target="#fig_19">19</ref>), and the obtained data are smoothed again with the smoothing interval size w = 5. The obtained data are then smoothed with w = 7, and so on up to w = 15 (Fig. <ref type="figure" target="#fig_26">20-21</ref>). At small values of the interval size w, the efficiency in terms of the smoothing effect decreases, which can be seen in the following figures. Smoothing that is applied to previously smoothed series smooths the data very effectively. We smooth the data for Article 2 using the smoothing interval sizes w = 3, 5, 7, 9, 11, 13, 15; the moving average showed the trend better than for Article 1 (Fig. 22-24). Next, the data for Article 2 are smoothed with the smoothing interval size w = 3 (Fig. 25-27), and the obtained data are smoothed again with w = 5, then with w = 7, and so on up to w = 15. Next, the data are smoothed with the smoothing interval size w = 3 (Fig. <ref type="figure" target="#fig_6">31</ref>), and the obtained smoothed data are smoothed again with w = 5, then with w = 7, and so on up to w = 15 (Fig. <ref type="figure" target="#fig_26">32-33</ref>). For the data on the frequency of 3-grams, exponential smoothing did not produce a "delay" for any of the three texts. Exponential smoothing softened these data only a little, and the general trend is harder to see. In addition, the correlation coefficients of the data are very low (Fig. 
<ref type="figure" target="#fig_26">34-42</ref>). Median smoothing. In this case, the same smoothing interval sizes and the same procedure are used as in paragraph 1. A characteristic feature of median smoothing is that it leaves monotonic parts of the data sequence and sharp changes unchanged, while in non-monotonic areas within the sliding smoothing interval it keeps only the centered value equal to their median, i.e. it effectively eliminates the levels that violate monotonicity. Median smoothing completely eliminates single extreme or anomalous levels that are at least half the smoothing interval apart, preserves sharp changes in trends (moving average and exponential smoothing blur them), and effectively eliminates isolated levels with very large or very small values that are random and stand out sharply among the other levels. These characteristics were confirmed by the median smoothing of the relative frequency in Article 1. lower than the moving average. We smooth the data using the smoothing interval sizes w = 3, 5, 7, 9, 11, 13, 15 (Fig. 43-45). The graphs are arranged in the following order: median w = 3 (for the intervals 0-700, 700-1400, 1400-2148), median w = 5, 7, 9, 11, 13, and median w = 15 (for the intervals 0-700, 700-1400, 1400-2148). Next, the data are smoothed with the smoothing interval size w = 3, and the obtained smoothed data are smoothed again with w = 5. We continue smoothing the obtained data with w = 7, and so on up to w = 15 (Fig. 46-48).</p></div>
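<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three smoothing procedures used in this section, together with the repeated smoothing with growing interval sizes, can be sketched as follows. The sketch rests on assumptions that the text does not fix: the window is centred on each point, it is truncated at the ends of the series, and exponential smoothing starts from the first observation. The function names and the sample series are hypothetical; the smoothing constant 0.1 and the interval sizes 3 to 15 are taken from the figure captions and the text.</p><p>
```python
from statistics import mean, median


def window_smooth(series, w, reducer):
    """Apply reducer (mean or median) over a centred window of odd size w.

    Near the ends the window is truncated to the available values; this
    edge rule is an assumption of the sketch.
    """
    half = w // 2
    return [reducer(series[max(0, i - half):i + half + 1])
            for i in range(len(series))]


def moving_average(series, w):
    return window_smooth(series, w, mean)


def median_smoothing(series, w):
    return window_smooth(series, w, median)


def exponential_smoothing(series, alpha=0.1):
    """s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def cascade(series, smoother, widths=(3, 5, 7, 9, 11, 13, 15)):
    """Smooth with w = 3, then smooth the result with w = 5, and so on."""
    for w in widths:
        series = smoother(series, w)
    return series


# Invented relative frequencies with two isolated peaks.
freq = [0.001, 0.002, 0.009, 0.001, 0.002, 0.001, 0.008, 0.002, 0.001]
print(moving_average(freq, 3))
print(median_smoothing(freq, 3))
print(exponential_smoothing(freq, alpha=0.1))
print(cascade(freq, moving_average))
```
</p><p>On the sample series the median filter removes the two isolated peaks entirely, while the moving average only spreads them over neighbouring points, which illustrates the difference between the two methods described above.</p></div>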
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Discussions</head><p>A graphical representation of the relationship between two studied sequences is called a correlation field or scatter plot. The graphical method gives a visual impression of the form of the relationship between the sequences. We therefore construct a correlation field for Articles 1 and 2 (Fig. <ref type="figure" target="#fig_8">49</ref>), Articles 1 and 3 (Fig. <ref type="figure" target="#fig_9">50</ref>), and Articles 2 and 3 (Fig. <ref type="figure" target="#fig_9">51</ref>). A visual assessment suggests a linear relationship in all three fields. The fields also show that a correlation is present, so we can assume that these Ukrainian articles may have been written by one author or are based on one topic. Visual assessment is not sufficient, however, so the correlation coefficient should be calculated for more accurate results. The correlation coefficient characterizes the degree of closeness of a linear dependence. The calculated values are: for Articles 1 and 2, correlation coefficient 0.575 and coefficient of determination 33%; for Articles 1 and 3, correlation coefficient 0.63023 and coefficient of determination 40%; for Articles 2 and 3, correlation coefficient 0.49038 and coefficient of determination 24%. Correlation coefficients whose absolute value is less than 0.7 but greater than 0.5 indicate a relationship of medium strength (the coefficients of determination are less than 50% but more than 25%). In the first two cases the relationship is of medium strength; in the third case it is weak but very close to medium, so it can also be treated as medium. With three different Ukrainian articles, a 100% correlation is obviously unlikely. 
So, given the medium-strength relationship, the assumption that these articles may have been written by the same author or are based on similar topics has been confirmed. When the pairwise statistical dependence deviates from linear, the correlation coefficient loses its meaning as a measure of the closeness of the relationship. In that case, a measure called the correlation ratio is used. Since the relationship between the pair of studied features is linear, the correlation ratio does not need to be calculated.</p><p>The autocorrelation function is the correlation of a function with a copy of itself shifted by a certain amount of the independent variable. Autocorrelation is used to find patterns, such as periodicity, in a data series.</p><p>The graph of the autocorrelation function is also called the correlogram (Fig. <ref type="figure" target="#fig_26">52</ref>).</p></div>
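<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quantities discussed above can be sketched in a few lines. The code assumes that the Pearson coefficient is the correlation measure, that the coefficient of determination is its square expressed as a percentage, and that the autocorrelation at a given lag is the Pearson correlation between the series and its shifted copy (one of several common estimators). All function names are hypothetical, and the label "strong" for values of 0.7 and above is an assumption; the text itself only names the weak and medium ranges.</p><p>
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant sequence; 0.0 is a convention here
    return cov / math.sqrt(var_x * var_y)


def determination_percent(r):
    """Coefficient of determination (r squared) as a percentage."""
    return 100 * r * r


def strength(r):
    """Verbal scale; the 'strong' label for 0.7 and above is assumed."""
    if abs(r) >= 0.7:
        return "strong"
    if abs(r) > 0.5:
        return "medium"
    return "weak"


def autocorrelation(series, lag):
    """Correlation of the series with a copy of itself shifted by lag positions."""
    if lag == 0:
        return 1.0
    return pearson(series[:-lag], series[lag:])


def correlogram(series, max_lag):
    """Autocorrelation values for lags 1..max_lag."""
    return [autocorrelation(series, lag) for lag in range(1, max_lag + 1)]


# Correlation coefficients reported for the three pairs of articles.
for r in (0.575, 0.63023, 0.49038):
    print(r, round(determination_percent(r)), strength(r))
```
</p><p>Applied to the three reported coefficients, the loop reproduces the determination values of 33%, 40% and 24% and classifies the third pair as weak, in line with the remark that it lies just below the medium range.</p></div>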
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 35: Correlogram</head><p>Fig. <ref type="figure" target="#fig_26">52</ref> shows that the studied series are not stationary, since for stationary time series the graph of the autocorrelation function should decrease rapidly after the first few values.</p><p>Next, the sequence of relative frequencies for Article 1 is divided into three equal parts of 715 values. For convenience, the data are placed in a separate table (Fig. <ref type="figure" target="#fig_9">53</ref>). The correlation matrix is a square table in which the correlation coefficient between two parameters is located at the intersection of the corresponding row and column. The correlation matrix for the column divided into three parts has been constructed. Correlation coefficients whose absolute value is less than 0.5 indicate a weak relationship; all values in the correlation matrix are close to 0, so we can conclude that there is no relationship at all. This is quite an expected result, as the data do not depend on each other and have different values. We find the coefficients of multiple correlation (Fig. <ref type="figure" target="#fig_9">54-55</ref>). We therefore compare the frequencies of all trigrams that begin with a particular letter (Fig. <ref type="figure" target="#fig_10">56</ref>). According to these graphs, Article 1 and Article 2 were more likely written by one author; Article 1 and Article 3 could also appear to have been written by one author (which is not the case), while Articles 2 and 3 were definitely written by different authors. Applying linguistic and statistical analysis of 3-grams to a set of articles makes it possible to form a subset of linguistically similar publications. Imposing additional conditions on this subset in the form of statistical and quantitative analyses (sets of keywords, stable phrases, stylistic, linguometric, etc.) will significantly reduce the subset, refining the list of works more likely to belong to the author.</p><p>Thus, the analysis of the content and frequency of occurrence of only business words will place Articles 1 and 3 in different subsets and Articles 1 and 2 in the same one.</p><p>This study does not solve the problem of author identification in full, because the differences in authorial traits are subjective and depend on the limitations imposed on the author's creative process. However, a system that implements such methods is able to give recommendations on the degree to which a text belongs to a particular author. Further experimental research is needed to test the proposed method for determining the author's style on other categories of texts, such as scientific humanities, art, journalism and others.</p></div>
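<div xmlns="http://www.tei-c.org/ns/1.0"><p>The splitting of a relative-frequency column into three equal parts and the correlation matrix of those parts, as described earlier in this section, can be sketched as follows. The helper names and the short sample series are hypothetical, and dropping the leftover values at the end of the column is an assumption, chosen because it is consistent with parts of 715 values for a column of 2147 entries.</p><p>
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0


def split_equal(series, parts=3):
    """Split into equally long consecutive parts; leftover tail values are dropped."""
    size = len(series) // parts
    return [series[i * size:(i + 1) * size] for i in range(parts)]


def correlation_matrix(columns):
    """Square table with the coefficient for columns i and j in row i, column j."""
    return [[pearson(a, b) for b in columns] for a in columns]


# Invented relative frequencies standing in for one article's column.
series = [0.004, 0.001, 0.003, 0.002, 0.005, 0.001,
          0.002, 0.006, 0.001, 0.003, 0.002]
parts = split_equal(series, 3)
for row in correlation_matrix(parts):
    print([round(value, 3) for value in row])
```
</p><p>The matrix is symmetric with ones on the diagonal, so only the off-diagonal entries carry information about how the parts relate to each other.</p></div>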
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusions</head><p>The article presents completed research in the field of information technology concerning computational linguistics, artificial intelligence and machine learning. A correlation analysis of text author identification results based on N-grams in Ukrainian scientific and technical texts has been performed. Three articles have been compared and the results have been obtained. Quantitative content analysis of scientific and technical texts has been studied, given that authorship determination systems typically use, fully or partially, the metrics for identifying plagiarism and rewriting. The article presents a method of determining the author by decomposition, based on the analysis of such speech coefficients as lexical diversity, degree of syntactic complexity, speech coherence, and the indices of exclusivity and concentration of the text. The parameters of the author's style, such as the numbers of words, sentences, prepositions and conjunctions and the number of words with given frequencies, have also been analyzed. Smoothing procedures are widely used in the algorithmic approach, so the relative frequency of use of 3-grams in the studied texts has been smoothed by the moving average, exponential and median smoothing methods. For high-quality and effective content analysis when determining the degree of text authorship, it is proposed to analyze the reference text in several stages. To achieve the research goal, a system that can select the language or languages of the analyzed content has been developed and implemented on the Victana web resource. 
To compare texts with each other, each text must be mapped to a numerical characteristic that is close for texts by the same author and differs for works by different authors; the density of the distribution function of letter combinations of three consecutive characters is used for this purpose. The rapid spread of text documents in electronic form has made automatic methods of content analysis important, including the classification and clustering of documents by various criteria.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2:</head><label></label><figDesc>Figure 2: The relative frequencies of occurrence of the ten most frequent symbols in Standard 1 and the studied Excerpts 1, 2, including omission</figDesc><graphic coords="6,126.32,147.89,342.22,130.45" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Step 11.</head><label>11</label><figDesc>Output the generalized results: • the number of N-grams found, with repetitions; • the number of N-grams found, without repetitions; • the total number of N-grams; • the number of characters in the fully cleaned text; • the number of characters in the text with spaces; • the number of words in the text; • the size of the alphabet.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>аги ади адр аєз ажа азо айв акі алг аль ами анд ант аоп апо ари асі асу атн афо ахо ацю ачу</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 10:</head><label>10</label><figDesc>Figure 10: Relative frequency for Articles 1-3</figDesc><graphic coords="13,225.50,542.93,144.97,149.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 14:</head><label>14</label><figDesc>Figure 14: Histogram of the relative frequency of N-grams in Articles 1-3</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>• 1 :</head><label>1</label><figDesc>ння [nnya] 0.008476, енн [enn] 0.007175, ого [oho] 0.005473. • 2: ння [nnya] 0.006448, ист [yst] 0.006356, ува [uva] 0.006233. • 3: ння [nnya] 0.008769, ого [oho] 0.007717, мет [met] 0.006314.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: Moving Average of Article 1 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc><graphic coords="16,72.00,316.43,451.50,83.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Moving Average of Article 1 for w=15 for the interval 0-700, 700-1400 and 1400-2100</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 5:</head><label>5</label><figDesc>Figure 5: Moving Average of Article 1 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Moving Average of Article 1 for w = 5, 7, 9,11, 13</figDesc><graphic coords="18,72.00,618.45,450.75,120.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_11"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Moving Average of Article 1 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_12"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Moving Average of Article 2 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_13"><head>Figure 9 : 13 Figure 10 :</head><label>91310</label><figDesc>Figure 9: Moving Average of Article 2 for w = 5, 7, 9,11, 13</figDesc><graphic coords="20,76.25,481.43,442.50,85.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_14"><head>Figure 11 :Figure 12 :</head><label>1112</label><figDesc>Figure 11: Moving Average of Article 2 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_15"><head>Figure 13 :</head><label>13</label><figDesc>Figure 13: The Moving Average of Article 2 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc><graphic coords="22,82.25,283.50,429.75,115.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_16"><head>Figure 14 :</head><label>14</label><figDesc>Figure 14: The Moving Average of Article 3 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc><graphic coords="22,76.25,634.15,442.50,103.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_17"><head>Figure 15 : 13 Figure 16 :</head><label>151316</label><figDesc>Figure 15: The Moving Average of Article 3 for w = 5, 7, 9,11, 13</figDesc><graphic coords="23,73.25,473.93,448.50,84.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_18"><head>Figure 17 :Figure 18 :</head><label>1718</label><figDesc>Figure 17: The Moving Average of Article 3 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc><graphic coords="24,77.38,341.22,440.04,75.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_19"><head>Figure 19 :</head><label>19</label><figDesc>Figure 19: The Moving Average of Article 3 for w=3 for the interval 0-700, 700-1400 and 1400-2100</figDesc><graphic coords="25,77.75,253.60,439.20,84.30" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_20"><head>Figure 20 :</head><label>20</label><figDesc>Figure 20: Exponential smoothing a=0.1 of Article 1 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="25,72.00,656.42,454.34,80.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_21"><head>Figure 23 :</head><label>23</label><figDesc>Figure 23: Exponential smoothing a=0.1 of Article 2 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="26,72.00,659.60,438.15,82.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_22"><head>Figure 26 :</head><label>26</label><figDesc>Figure 26: Exponential smoothing a=0.1 of Article 3 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="28,77.00,292.43,440.66,68.90" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_23"><head>Figure 29 :</head><label>29</label><figDesc>Figure 29: Median smoothing w = 3 of Article 1 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="29,76.68,494.51,441.64,81.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_24"><head>Figure 32 :</head><label>32</label><figDesc>Figure 32: Median smoothing w = 3 of Article 1 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="31,74.38,335.68,446.06,82.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_25"><head>Figure 34 :</head><label>34</label><figDesc>Figure 34: Median smoothing w = 15 of Article 1 for the interval 0-700, 700-1400, 1400-2148</figDesc><graphic coords="32,72.00,263.30,458.96,88.45" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_26"><head>Figure 49 :</head><label>49</label><figDesc>Figure 49: Correlation field for Articles 1 and 2</figDesc><graphic coords="32,72.50,460.91,449.25,110.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_27"><head>Figure 53 :</head><label>53</label><figDesc>Figure 53: The column divided into 3 equal parts and the correlation matrix</figDesc><graphic coords="34,143.50,72.00,104.25,261.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_28"><head>Figure 36 :</head><label>36</label><figDesc>Figure 36: Autocorrelation</figDesc><graphic coords="34,75.50,430.91,146.25,97.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_29"><head>Figure 56 :</head><label>56</label><figDesc>Figure 56: Usage of 3-grams that start with a specific letter (Article 1: blue, Article 2: red, Article 3: green)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_30"><head></head><label></label><figDesc>Plot axis labels: Ukrainian letters г to я; legend: ||p1-p2||, ||p1-p3||, ||p2-p3||</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="7,98.75,109.95,397.43,381.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="8,111.13,160.54,372.73,502.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="9,74.00,154.95,222.75,284.95" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="9,297.50,109.95,223.50,330.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Frequencies of letter occurrence in the standard and the studied passages</figDesc><table><row><cell>Letter</cell><cell>Frequency of use of Ukrainian language letters</cell><cell>Absolute frequency of letters in Passage 1</cell><cell>Absolute frequency of letters in Passage 2</cell><cell>Relative frequency of letters in Passage 1</cell><cell>Relative frequency of letters in Passage 2</cell></row><row><cell>ф</cell><cell>0.003</cell><cell>1</cell><cell>0</cell><cell>0.00</cell><cell>0.00</cell></row><row><cell>щ</cell><cell>0.004</cell><cell>3</cell><cell>1</cell><cell>0.01</cell><cell>0.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Example of author speech coefficient calculations</figDesc><table><row><cell>Coefficient</cell><cell>Input data</cell><cell>Calculation</cell></row><row><cell>Lexical diversity: Kl=W/N</cell><cell>W=184; N=295</cell><cell>Kl=0.6237</cell></row><row><cell>Speech connectivity: Kz=(Z+S)/(3*P)</cell><cell>Z=20; S=28; P=18</cell><cell>Kz=0.8889</cell></row><row><cell>Syntactic complexity: Ks=1-P/W</cell><cell>P=18; W=184</cell><cell>Ks=0.9022</cell></row><row><cell>Concentration index: Ikt=W10/W</cell><cell>W10=2; W=184</cell><cell>Ikt=0.0109</cell></row><row><cell>Exclusivity index: Iwt=W1/W</cell><cell>W1=141; W=184</cell><cell>Iwt=0.7663</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Values of parameters for the analyzed Articles 1-3</figDesc><table><row><cell>Parameters</cell><cell>Article 1</cell><cell>Article 2</cell><cell>Article 3</cell></row><row><cell>Total characters in plain text</cell><cell>29967</cell><cell>32570</cell><cell>37062</cell></row><row><cell>Total characters in the raw text</cell><cell>39792</cell><cell>39663</cell><cell>47084</cell></row><row><cell>Total words</cell><cell>5475</cell><cell>5358</cell><cell>6060</cell></row><row><cell>Total N-grams found (with repetition)</cell><cell>29494</cell><cell>29862</cell><cell>36383</cell></row><row><cell>Total N-grams found (without repetition)</cell><cell>4354</cell><cell>4377</cell><cell>3890</cell></row><row><cell>Total N-gram</cell><cell>35937</cell><cell>35937</cell><cell>35937</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Distribution of 1-gram frequencies in Articles 1-3</figDesc><table><row><cell>1-gram</cell><cell>N1</cell><cell>P1</cell><cell>N2</cell><cell>P2</cell><cell>N3</cell><cell>P3</cell></row><row><cell>о</cell><cell>2824</cell><cell>0.094240</cell><cell>2472</cell><cell>0.075898</cell><cell>3870</cell><cell>0.103601</cell></row><row><cell>н</cell><cell>2471</cell><cell>0.082460</cell><cell>2370</cell><cell>0.072766</cell><cell>2888</cell><cell>0.077312</cell></row><row><cell>а</cell><cell>2255</cell><cell>0.075252</cell><cell>2698</cell><cell>0.082837</cell><cell>2491</cell><cell>0.066685</cell></row><row><cell>т</cell><cell>2102</cell><cell>0.070146</cell><cell>1956</cell><cell>0.060055</cell><cell>2141</cell><cell>0.057315</cell></row><row><cell>і</cell><cell>1789</cell><cell>0.059701</cell><cell>1967</cell><cell>0.060393</cell><cell>2250</cell><cell>0.060233</cell></row><row><cell>и</cell><cell>1732</cell><cell>0.057799</cell><cell>1852</cell><cell>0.056862</cell><cell>2036</cell><cell>0.054504</cell></row><row><cell>в</cell><cell>1654</cell><cell>0.055196</cell><cell>1590</cell><cell>0.048818</cell><cell>1915</cell><cell>0.051265</cell></row><row><cell>с</cell><cell>1549</cell><cell>0.051692</cell><cell>1327</cell><cell>0.040743</cell><cell>1384</cell><cell>0.037050</cell></row><row><cell>е</cell><cell>1404</cell><cell>0.046853</cell><cell>1453</cell><cell>0.044612</cell><cell>2090</cell><cell>0.055950</cell></row><row><cell>р</cell><cell>1335</cell><cell>0.044550</cell><cell>1722</cell><cell>0.052871</cell><cell>1893</cell><cell>0.050676</cell></row><row><cell>к</cell><cell>1279</cell><cell>0.042682</cell><cell>1110</cell><cell>0.034080</cell><cell>1453</cell><cell>0.038897</cell></row><row><cell>л</cell><cell>1116</cell><cell>0.037242</cell><cell>927</cell><cell>0.028462</cell><cell>906</cell><cell>0.024254</cell></row><row><cell>у</cell><cell>987</cell><cell>0.032937</cell><cell>960</cell><cell>0.029475</cell><cell>1195</cell><cell>0.031990</cell></row><row><cell>д</cell><cell>859</cell><cell>0.028666</cell><cell>939</cell><cell>0.028830</cell><cell>1319</cell><cell>0.035310</cell></row><row><cell>м</cell><cell>808</cell><cell>0.026964</cell><cell>976</cell><cell>0.029966</cell><cell>1399</cell><cell>0.037451</cell></row><row><cell>п</cell><cell>647</cell><cell>0.021591</cell><cell>825</cell><cell>0.025330</cell><cell>1138</cell><cell>0.030464</cell></row><row><cell>я</cell><cell>647</cell><cell>0.021591</cell><cell>681</cell><cell>0.020909</cell><cell>864</cell><cell>0.023129</cell></row><row><cell>з</cell><cell>623</cell><cell>0.020790</cell><cell>644</cell><cell>0.019773</cell><cell>946</cell><cell>0.025325</cell></row><row><cell>ь</cell><cell>498</cell><cell>0.016619</cell><cell>418</cell><cell>0.012834</cell><cell>613</cell><cell>0.016410</cell></row><row><cell>ч</cell><cell>459</cell><cell>0.015317</cell><cell>289</cell><cell>0.008873</cell><cell>574</cell><cell>0.015366</cell></row><row><cell>г</cell><cell>408</cell><cell>0.013615</cell><cell>373</cell><cell>0.011452</cell><cell>651</cell><cell>0.017427</cell></row><row><cell>х</cell><cell>355</cell><cell>0.011847</cell><cell>384</cell><cell>0.011790</cell><cell>482</cell><cell>0.012903</cell></row><row><cell>б</cell><cell>284</cell><cell>0.009477</cell><cell>569</cell><cell>0.017470</cell><cell>428</cell><cell>0.011458</cell></row><row><cell>ж</cell><cell>246</cell><cell>0.008209</cell><cell>210</cell><cell>0.006448</cell><cell>176</cell><cell>0.004712</cell></row><row><cell>й</cell><cell>239</cell><cell>0.007976</cell><cell>260</cell><cell>0.007983</cell><cell>265</cell><cell>0.007094</cell></row><row><cell>ц</cell><cell>224</cell><cell>0.007475</cell><cell>334</cell><cell>0.010255</cell><cell>299</cell><cell>0.008004</cell></row><row><cell>є</cell><cell>188</cell><cell>0.006274</cell><cell>165</cell><cell>0.005066</cell><cell>347</cell><cell>0.009289</cell></row><row><cell>ф</cell><cell>179</cell><cell>0.005973</cell><cell>209</cell><cell>0.006417</cell><cell>137</cell><cell>0.003668</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>Frequencies of letter occurrence in the standard and the studied passages</figDesc><table><row><cell>Mean</cell><cell>0.000366529</cell><cell>0.000339199</cell><cell>0.000392978</cell></row><row><cell>Standard error</cell><cell>1.28793E-05</cell><cell>1.24565E-05</cell><cell>1.53165E-05</cell></row><row><cell>Median</cell><cell>0.000167</cell><cell>0.000154</cell><cell>0.000162</cell></row><row><cell>Mode</cell><cell>0.000033</cell><cell>0.000031</cell><cell>0.000027</cell></row><row><cell>Standard deviation</cell><cell>0.000596773</cell><cell>0.00057718</cell><cell>0.000709699</cell></row><row><cell>Sample variance</cell><cell>3.56138E-07</cell><cell>3.33136E-07</cell><cell>5.03673E-07</cell></row><row><cell>Kurtosis</cell><cell>37.42530062</cell><cell>32.63050249</cell><cell>29.5089837</cell></row><row><cell>Skewness</cell><cell>4.881688545</cell><cell>4.62453506</cell><cell>4.54877741</cell></row><row><cell>Range</cell><cell>0.008443</cell><cell>0.006417</cell><cell>0.008742</cell></row><row><cell>Minimum</cell><cell>0.000033</cell><cell>0.000031</cell><cell>0.000027</cell></row><row><cell>Maximum</cell><cell>0.008476</cell><cell>0.006448</cell><cell>0.008769</cell></row><row><cell>Sum</cell><cell>0.786938</cell><cell>0.72826</cell><cell>0.843723</cell></row><row><cell>Count</cell><cell>2147</cell><cell>2147</cell><cell>2147</cell></row><row><cell>Confidence level (95.0%)</cell><cell>2.52573E-05</cell><cell>2.4428E-05</cell><cell>3.00366E-05</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Chastoty povtoryuvanosti bukv i bihram u vidkrytykh tekstakh ukrayinsʹkoyu movoyu [Frequency of repetition of letters and bigrams in open texts in Ukrainian]</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Sushko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">Y</forename><surname>Fomychova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Barsukov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Protection of information</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">3</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
	<note>Zakhyst informatsiyi</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Authorship definition based on the frequency distribution of letter combinations</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Dyurdeva</surname></persName>
		</author>
		<ptr target="https://www.math.spbu.ru/SD_AIS/documents/2015-12-441/2015-12-b-07.pdf" />
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The global k-means clustering algorithm</title>
		<author>
			<persName><forename type="first">A</forename><surname>Likas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Vlassis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Verbeek</surname></persName>
		</author>
		<ptr target="https://www.cs.uoi.gr/~arly/papers/PR2003.pdf" />
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="451" to="461" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The Application of K-medoids and PAM to the Clustering of Rules</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Reynolds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Richards</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">J</forename><surname>Rayward-Smith</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-540-28651-6_25</idno>
	</analytic>
	<monogr>
		<title level="j">Lecture Notes in Computer Science</title>
		<imprint>
			<biblScope unit="volume">3177</biblScope>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Criterial analysis of gene expression sequences to create the objective clustering inductive technology</title>
		<author>
			<persName><forename type="first">S</forename><surname>Babichev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Taif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lytvynenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Osypenko</surname></persName>
		</author>
		<idno type="DOI">10.1109/ELNANO.2017.7939756</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Electronics and Nanotechnology</title>
				<meeting>the International Conference on Electronics and Nanotechnology</meeting>
		<imprint>
			<publisher>ELNANO</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="244" to="248" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">An Evaluation of the Objective Clustering Inductive Technology Effectiveness Implemented Using Density-Based and Agglomerative Hierarchical Clustering Algorithms</title>
		<author>
			<persName><forename type="first">S</forename><surname>Babichev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Durnyak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Pikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Senkivskyy</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-26474-1_37</idno>
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">1020</biblScope>
			<biblScope unit="page" from="532" to="553" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Objective clustering inductive technology of gene expression profiles based on SOTA clustering algorithm</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Babichev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">I</forename><surname>Kornelyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">I</forename><surname>Lytvynenko</surname></persName>
		</author>
		<idno type="DOI">10.7124/bc.000961</idno>
	</analytic>
	<monogr>
		<title level="j">Biopolymers and Cell</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="379" to="392" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">K-Means Clustering</title>
		<ptr target="https://people.revoledu.com/kardi/tutorial/kMean/" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Methodology and software package for identifying the author of an unknown text</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Romanov</surname></persName>
		</author>
		<ptr target="https://www.dissercat.com/content/metodika-i-programmnyi-kompleks-dlya-identifikatsii-avtora-neizvestnogo-teksta" />
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Identification of the author of the text by the frequency distribution of letter combinations</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Borisov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yu</forename><forename type="middle">N</forename><surname>Orlov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Osminin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Informatics</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="95" to="108" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Determining the genre and author of a literary work by statistical methods</title>
		<author>
			<persName><forename type="first">Yu</forename><forename type="middle">N</forename><surname>Orlov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Osminin</surname></persName>
		</author>
		<ptr target="https://keldysh.ru/papers/2013/prep2013_27.pdf" />
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Methods of statistical analysis of literary texts</title>
		<author>
			<persName><forename type="first">Yu</forename><forename type="middle">N</forename><surname>Orlov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Osminin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
			<publisher>LIBROKOM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Clustering of Russian Manuscripts Based on the Feature Relationship Graph</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Pavlov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Dyurdeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Shalymov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer tools in education</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="24" to="35" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Formal methods for determining the authorship of texts</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">V</forename><surname>Batura</surname></persName>
		</author>
		<ptr target="https://cyberleninka.ru/article/n/formalnye-metody-opredeleniya-avtorstva-tekstov" />
	</analytic>
	<monogr>
		<title level="j">NGU Bulletin of Series Information technologies</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Intro to Natural Language Processing</title>
		<author>
			<persName><forename type="first">M</forename><surname>Romanyshyn</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>Grammarly, Inc</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Babash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">P</forename><surname>Shankin</surname></persName>
		</author>
		<ptr target="https://pub.flowpaper.com/docs/https://book.edu-lib.net/books1/Babash_Kriprografiya_2.pdf" />
		<title level="m">Cryptography</title>
		<imprint>
			<publisher>SOLON-R</publisher>
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Fundamentals of cryptography</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Alferov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Yu</forename><surname>Zubov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Kuzmin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Cheryomushkin</surname></persName>
		</author>
		<ptr target="https://studfile.net/preview/6311470/" />
		<imprint>
			<date type="published" when="2002">2002</date>
			<publisher>Helios</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Probability and information</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Yaglom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">M</forename><surname>Yaglom</surname></persName>
		</author>
		<imprint>
			<publisher>Nauka, Phys.-Math. lit.</publisher>
			<date type="published" when="1973">1973</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Information measurements of language</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">G</forename><surname>Piotrovsky</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1968">1968</date>
			<publisher>Nauka</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Information theory and linguistics</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">M</forename><surname>Yaglom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L</forename><surname>Dobrushin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Yaglom</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Questions of linguistics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="100" to="110" />
			<date type="published" when="1960">1960</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">On the possibility of increasing the speed of transmission of telegraph messages</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Lebedev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Garmash</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Telecommunications</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="68" to="69" />
			<date type="published" when="1958">1958</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Prediction and entropy of the printed English</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Shannon</surname></persName>
		</author>
		<ptr target="https://www.princeton.edu/~wbialek/rome/refs/shannon_51.pdf" />
		<imprint>
			<date type="published" when="1951">1951</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Vstup do kryptolohiyi [Introduction to cryptology]</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">V</forename><surname>Verbitskyy</surname></persName>
		</author>
		<imprint>
			<publisher>Vydavnytstvo Naukovo-tekhnichnoyi literatury [Publishing House of Scientific and Technical Literature]</publisher>
			<pubPlace>Lviv</pubPlace>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Chastotni slovnyky ta yikh vykorystannya [Frequency dictionaries and their use]</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">I</forename><surname>Perebyynis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Muravytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">P</forename><surname>Darchuk</surname></persName>
		</author>
		<imprint>
			<publisher>Naukova dumka</publisher>
			<date type="published" when="1983">1983</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Development of methods, models, and means for the author attribution of a text</title>
		<author>
			<persName><forename type="first">I</forename><surname>Khomytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holovatyy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Morushko</surname></persName>
		</author>
		<idno type="DOI">10.15587/1729-4061.2018.132052</idno>
	</analytic>
	<monogr>
		<title level="j">Eastern-European Journal of Enterprise Technologies</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="41" to="46" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Authorship and Style Attribution by Statistical Methods of Style Differentiation on the Phonological Level</title>
		<author>
			<persName><forename type="first">I</forename><surname>Khomytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-01069-0_8</idno>
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">871</biblScope>
			<biblScope unit="page" from="105" to="118" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Software-Based Approach Towards Automated Authorship Acknowledgement - Chi-Square Test on One Consonant Group</title>
		<author>
			<persName><forename type="first">I</forename><surname>Khomytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kryvinska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bazylevych</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics9071138</idno>
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page">1138</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Approach for Minimization of Phoneme Groups in Authorship Attribution</title>
		<author>
			<persName><forename type="first">I</forename><surname>Khomytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bazylevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Shylinska</surname></persName>
		</author>
		<idno type="DOI">10.47839/IJC.19.1.1693</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computing</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="55" to="62" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://web.stanford.edu/~jurafsky/slp3/3.pdf" />
		<title level="m">N-gram Language Models</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Speech and Language Processing</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://web.stanford.edu/~jurafsky/slp3/ed3book_sep212021.pdf" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Regular Expressions, Text Normalization, Edit Distance</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://web.stanford.edu/~jurafsky/slp3/2.pdf" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Domain knowledge query conversation bots in instant messaging (IM)</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">S</forename><surname>Goh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Fung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Depickere</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="681" to="691" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Buk</surname></persName>
		</author>
		<title level="m">Osnovy statystychnoy lingvistyky</title>
		<imprint>
			<publisher>LNU n. I. Franko Publishing House</publisher>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Method for Determining Linguometric Coefficient Dynamics of Ukrainian Text Content Authorship</title>
		<author>
			<persName><forename type="first">V</forename><surname>Vysotska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">B</forename><surname>Fernandes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lytvyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Emmerich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hrendus</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-01069-0_10</idno>
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">871</biblScope>
			<biblScope unit="page" from="132" to="151" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">The control agent with fuzzy logic</title>
		<author>
			<persName><forename type="first">P</forename><surname>Kravets</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Perspective Technologies and Methods in MEMS Design</title>
				<meeting>the International Conference on Perspective Technologies and Methods in MEMS Design<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="40" to="41" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">The Game Method for Orthonormal Systems Construction</title>
		<author>
			<persName><forename type="first">P</forename><surname>Kravets</surname></persName>
		</author>
		<idno type="DOI">10.1109/cadsm.2007.4297555</idno>
		<ptr target="https://doi.org/10.1109/cadsm.2007.4297555" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2007 9th International Conference - The Experience of Designing and Applications of CAD Systems in Microelectronics</title>
				<meeting>the 2007 9th International Conference - The Experience of Designing and Applications of CAD Systems in Microelectronics<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Game Model of Dragonfly Animat Self-Learning</title>
		<author>
			<persName><forename type="first">P</forename><surname>Kravets</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Perspective Technologies and Methods in MEMS Design</title>
				<meeting>the International Conference on Perspective Technologies and Methods in MEMS Design<address><addrLine>Lviv</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="195" to="201" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Development of the quantitative method for automated text content authorship attribution based on the statistical analysis of N-grams distribution</title>
		<author>
			<persName><forename type="first">V</forename><surname>Lytvyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vysotska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Budz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Pelekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sokulska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kovalchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dzyubyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Tereshchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Komar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Eastern-European Journal of Enterprise Technologies</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="28" to="51" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Recommendation System Development Based on Intelligent Search NLP and Machine Learning Methods</title>
		<author>
			<persName><forename type="first">I</forename><surname>Balush</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vysotska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Albota</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2917</biblScope>
			<biblScope unit="page" from="584" to="617" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">The Text Classification Based on Big Data Analysis for Keyword Definition Using Stemming</title>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Matseliukh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ivaniv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Schuchmann</surname></persName>
		</author>
		<idno type="DOI">10.1109/CSIT52700.2021.9648764</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT)</title>
				<meeting>the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT)<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="184" to="188" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">The Method of Text Tonality Classification</title>
		<author>
			<persName><forename type="first">N</forename><surname>Shakhovska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shakhovska</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Computer Sciences and Information Technologies (CSIT)</title>
				<meeting>the Computer Sciences and Information Technologies (CSIT)<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="19" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<analytic>
		<title level="a" type="main">The Kolmogorov-Smirnov&apos;s Test for Authorship Attribution on the Phonological Level</title>
		<author>
			<persName><forename type="first">I</forename><surname>Khomytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Teslyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bordyuk</surname></persName>
		</author>
		<idno type="DOI">10.1109/CSIT49958.2020.9322042</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Computer Sciences and Information Technologies (CSIT)</title>
				<meeting>the Computer Sciences and Information Technologies (CSIT)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="259" to="262" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<analytic>
		<title level="a" type="main">Language-independent features for authorship attribution on Ukrainian texts</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Hlavcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bobicev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Kanishcheva</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2833</biblScope>
			<biblScope unit="page" from="134" to="143" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<analytic>
		<title level="a" type="main">Precision Automated Phonetic Analysis of Speech Signals for Information Technology of Text-dependent Authentication of a Person by Voice</title>
		<author>
			<persName><forename type="first">O</forename><surname>Bisikalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Boivan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Khairova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">V</forename><surname>Kovtun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kovtun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2853</biblScope>
			<biblScope unit="page" from="276" to="288" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b44">
	<analytic>
		<title level="a" type="main">Implicit Visual Attention Feedback System for Wikipedia Users</title>
		<author>
			<persName><forename type="first">N</forename><surname>Dubey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R S</forename><surname>Iyengar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Setia</surname></persName>
		</author>
		<idno type="DOI">10.1145/3479986.3479993</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Symposium on Open Collaboration</title>
				<meeting>the 17th International Symposium on Open Collaboration<address><addrLine>NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b45">
	<analytic>
		<title level="a" type="main">The lexical innovations identification in English-language eurointegration discourse for the goods analysis by comments in e-commerce resources</title>
		<author>
			<persName><forename type="first">V</forename><surname>Lytvyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Danylyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bublyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Panasyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Korolenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies</title>
				<meeting>the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies<address><addrLine>Lviv</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="85" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b46">
	<analytic>
		<title level="a" type="main">Intelligent system for film script formation based on artbook text and Big Data analysis</title>
		<author>
			<persName><forename type="first">O</forename><surname>Hladun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bublyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Schuchmann</surname></persName>
		</author>
		<idno type="DOI">10.1109/CSIT52700.2021.9648682</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT)</title>
				<meeting>the 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT)<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021-09-25">22-25 September, 2021</date>
			<biblScope unit="page" from="138" to="146" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<analytic>
		<title level="a" type="main">The user&apos;s psychological state identification based on Big Data analysis for person&apos;s electronic diary</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dyriv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrunyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Burov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Karpov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computer Sciences and Information Technologies (CSIT)</title>
				<meeting><address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021-09-25">22-25 September, 2021</date>
			<biblScope unit="page" from="101" to="112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b48">
	<analytic>
		<title level="a" type="main">Corpus Technologies in Translation Studies: Fiction as Document</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hrytsiv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Shestakevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shyyka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2917</biblScope>
			<biblScope unit="page" from="327" to="343" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b49">
	<analytic>
		<title level="a" type="main">Development of a Speech-to-Text Program for People with Hearing Impairments</title>
		<author>
			<persName><forename type="first">D</forename><surname>Koshtura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrunyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Shestakevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2917</biblScope>
			<biblScope unit="page" from="565" to="583" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b50">
	<analytic>
		<title level="a" type="main">Intelligent System for Checking the Authenticity of Goods Based on Blockchain Technology</title>
		<author>
			<persName><forename type="first">O</forename><surname>Prokipchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bublyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Panasyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Yakimtsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kovalchuk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2917</biblScope>
			<biblScope unit="page" from="618" to="665" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b51">
	<analytic>
		<title level="a" type="main">Uniform method of operative content management in web systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kowalska-Styczen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Lozynska</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2136</biblScope>
			<biblScope unit="page" from="62" to="77" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b52">
	<analytic>
		<title level="a" type="main">Heterogeneous data with agreed content aggregation system development</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kowalska-Styczen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Burov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vasevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Pelekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ryshkovets</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2386</biblScope>
			<biblScope unit="page" from="35" to="54" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b53">
	<analytic>
		<title level="a" type="main">The mobile application development based on online music library for socializing in the world of bard songs and scouts&apos; bonfires</title>
		<author>
			<persName><forename type="first">B</forename><surname>Rusyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Pohreliuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rzheuskyi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kubik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ryshkovets</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vysotskyi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">B</forename><surname>Fernandes</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-33695-0_49</idno>
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">1080</biblScope>
			<biblScope unit="page" from="734" to="756" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b54">
	<analytic>
		<title level="a" type="main">Web Content Monitoring System Development</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Yevseyeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dosyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tyhonov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zakharchuk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2362</biblScope>
			<biblScope unit="page" from="126" to="142" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b55">
	<analytic>
		<title level="a" type="main">Medical news aggregation and ranking of taking into account the user needs</title>
		<author>
			<persName><forename type="first">N</forename><surname>Antonyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrunyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vasevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kalinina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Borzov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2488</biblScope>
			<biblScope unit="page" from="369" to="382" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b56">
	<analytic>
		<title level="a" type="main">Online Tourism System Development for Searching and Planning Trips with User&apos;s Requirements</title>
		<author>
			<persName><forename type="first">N</forename><surname>Antonyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Medykovskyy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dverii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Oborska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Krylyshyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vysotsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tsiura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Naum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">1080</biblScope>
			<biblScope unit="page" from="831" to="863" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b57">
	<analytic>
		<title level="a" type="main">Development of information system for aggregation and ranking of news taking into account the user needs</title>
		<author>
			<persName><forename type="first">V</forename><surname>Andrunyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vasevych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chernovol</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Antonyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kalinina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Korobchynskyi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">2604</biblScope>
			<biblScope unit="page" from="1127" to="1171" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b58">
	<analytic>
		<title level="a" type="main">Commercial content distribution system based on neural network and machine learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Demchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Rusyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Pohreliuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gozhyj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kalinina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Antonyuk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2516</biblScope>
			<biblScope unit="page" from="40" to="57" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b59">
	<analytic>
		<title level="a" type="main">Design of a system for dynamic integration of weakly structured data based on mash-up technology</title>
		<author>
			<persName><forename type="first">I</forename><surname>Pelekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrunyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Dyyak</surname></persName>
		</author>
		<idno type="DOI">10.1109/DSMP47368.2020.9204160</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Data Stream Mining and Processing</title>
				<meeting>the Data Stream Mining and Processing</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="420" to="425" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b60">
	<analytic>
		<title level="a" type="main">Application of ontologies and meta-models for dynamic integration of weakly structured data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Pelekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bublyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bobyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Matseliukh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<idno type="DOI">10.1109/DSMP47368.2020.9204321</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Data Stream Mining and Processing</title>
		<meeting>the Data Stream Mining and Processing</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="432" to="437" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b61">
	<analytic>
		<title level="a" type="main">Information resources analysis system of dynamic integration semi-structured data in a web environment</title>
		<author>
			<persName><forename type="first">A</forename><surname>Berko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Pelekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chyrun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Dyyak</surname></persName>
		</author>
		<idno type="DOI">10.1109/DSMP47368.2020.9204101</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Data Stream Mining and Processing</title>
		<meeting>the Data Stream Mining and Processing</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="414" to="419" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b62">
	<monogr>
		<title level="m" type="main">Victana Web-resource</title>
		<author>
			<persName><forename type="first">V</forename><surname>Vysotska</surname></persName>
		</author>
		<ptr target="https://victana.lviv.ua/nlp/n-grams" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
