<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Digital Content Processing Method for Biometric Identification of Personality Based on Artificial Intelligence Approaches</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Eugene</forename><surname>Fedorov</surname></persName>
							<email>fedorovee75@ukr.net</email>
							<affiliation key="aff0">
								<orgName type="institution">Cherkasy State Technological University</orgName>
								<address>
									<settlement>Cherkasy</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">E. O. Paton Electric Welding Institute</orgName>
								<address>
									<settlement>Kyiv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Digital Content Processing Method for Biometric Identification of Personality Based on Artificial Intelligence Approaches</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">E6CACCA4D5A257EAA8714A1A5CD41918</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:08+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>digital content processing</term>
					<term>biometric identification of personality</term>
					<term>artificial neural network</term>
					<term>fuzzy inference systems</term>
					<term>genetic algorithm</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The paper proposes a method of digital content processing for biometric identification of a person based on artificial intelligence approaches. To achieve this goal, a method for generating digital content features, a model of the structure of a digital content processing system, and a method for determining the parameter values of the mathematical model of the digital content processing system are proposed. The proposed feature generation automates the processing of digital content, which increases the accuracy and speed of determining feature values. The proposed model structure of the digital content processing system represents knowledge in the form of rules that are easily accessible to human understanding, which simplifies determining the structure of the system, and also allows parallel processing of information, which increases the learning speed. The proposed method for determining the parameter values of the model, based on the genetic algorithm, combines directed and random search, which decreases the probability of getting stuck in a local extremum and provides an acceptable speed of determining the parameter values. The proposed method of digital content processing for biometric identification of a person by voice can be used in various intelligent digital content processing systems.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Human-machine interfaces are one of the directions of digital content processing. For these interfaces, biometric identification of a person is important.</p><p>Automated biometric identification of a person means decision making based on acoustic and visual information, which improves the quality of recognition of the person being studied <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref>. Unlike the traditional approach, computer biometric identification speeds up and improves the accuracy of the recognition process, which is especially critical in limited time conditions.</p><p>A special class of biometric identification of a person is formed by methods based on the analysis of acoustic information <ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref>.</p><p>The methods of biometric identification of a person by voice include: dynamic programming <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10]</ref>; vector quantization <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>; artificial neural networks <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14]</ref>; decision tree <ref type="bibr" target="#b14">[15]</ref>; Gaussian mixture models (GMM) <ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref>; their combination <ref type="bibr" target="#b19">[20]</ref>.</p><p>Artificial neural networks are the most popular methods. The advantages of neural networks consist in: the possibility of their training and adaptation; the ability to identify patterns in the data, their generalization, i.e. 
extracting knowledge from data, so no prior knowledge about the object (for example, its mathematical model) is required; parallel processing of information, which increases the computing power.</p><p>The disadvantages of neural networks include: the difficulty of determining the network structure, since there are no algorithms for calculating the number of layers and of neurons in each layer for specific applications; the difficulty of forming a representative sample; a high probability that the learning and adaptation methods get stuck in a local extremum; the inaccessibility to human understanding of the knowledge accumulated by the network (it is impossible to present the relationship between input and output in the form of rules), since it is distributed among all elements of the neural network and is represented by its weighting coefficients.</p><p>Recently, neural networks have been combined with fuzzy inference systems. The advantages of fuzzy inference systems are the following: knowledge is presented in the form of rules that are easily accessible to human understanding; no accurate assessment of the variables of the object is needed (incomplete and inaccurate data are acceptable).</p><p>The disadvantages of fuzzy inference systems include: the impossibility of training and adaptation (the parameters of the membership functions cannot be configured automatically); the lack of parallel processing of information, which limits the computing power.</p><p>Since genetic algorithms can be used instead of neural network learning algorithms to train the membership function parameters, we note their advantages and disadvantages.</p><p>The advantage of genetic algorithms for neural network training is that the probability of getting stuck in a local extremum decreases.</p><p>The disadvantages of genetic algorithms for neural network training are the following: the speed of the solution search is lower than that of neural network training methods; in the 
case of binary genes, an increase in the search space reduces the accuracy of the solution at a constant chromosome length; in the case of binary genes, encoding/decoding operations reduce the speed of the algorithm.</p><p>In this regard, it is relevant to create a method of digital content processing for biometric identification of a person that eliminates these drawbacks.</p><p>The aim of the work is to increase the efficiency of the digital content processing system by means of an artificial neuro-fuzzy network trained on the basis of the genetic algorithm.</p><p>To achieve this goal, it is necessary to solve the following tasks:</p><p>1. Generation of digital content attributes.</p><p>2. Creation of a model of the digital content processing system.</p><p>3. Choice of the structure of the method for determining the parameter values of the mathematical model of the digital content processing system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Generation of digital content attributes</head><p>The generation of digital content attributes in the case of biometric identification of a person by voice provides for the following steps:</p><p>─ determination of vocal segments of a speech signal based on statistical estimation of short-term energies; ─ definition of formants of the central frame of the vocal segment; ─ choice of vocal speech sound attributes based on formants of the central frame of the vocal segment.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Determination of vocal segments of a speech signal based on statistical estimation of short-term energies</head><p>The paper proposes a method for determining vocal segments of a speech signal based on statistical estimation of short-term energies, which includes the following steps:</p><p>1. Set a speech signal with one vocal sound y(n), n = 1, …, N_f. Set the number of quantization levels of the speech signal L (for an 8-bit sound sample L = 256). Set the length N of the frame on which the short-term energy is calculated, N = 2^b + 1, where the integer parameter b is selected from the inequality</p><formula xml:id="formula_0">b − 1 &lt; log2(f_s / f_min) ≤ b,</formula><p>where f_s is the sampling frequency and f_min is the minimum expected fundamental frequency.</p><p>2. Calculate the short-term energies</p><formula xml:id="formula_1">E(n) = Σ_{m=−N/2}^{N/2} y²(n + m), n = N/2 + 1, …, N_f − N/2 − 1.</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Calculate the mathematical expectation of short-term energies</head><p>On the basis of the mathematical expectation of the short-term energies, the statistical threshold T is obtained, and the borders of the vocal segment are determined:</p><p>6.2. If E(n) &lt; T and E(n + 1) ≥ T, then set the left border N_l = n + 1;</p><p>6.3. If E(n) ≥ T and E(n + 1) &lt; T, then set the right border N_r = n and proceed to completion;</p><p>6.4. If n + 1 ≤ N_f − N/2 − 1, then go to the next sample, i.e. n = n + 1, and go to step 6.2; otherwise set N_r = n and proceed to completion.</p><p>As a result, the left and right boundaries of the vocal segment are determined. For the method of formant determination, the frame centered at the sample with the number</p><formula xml:id="formula_3">N_c = round((N_l + N_r)/2)</formula><p>is selected as the central frame.</p></div>
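The energy computation and threshold-crossing boundary search described above can be sketched in Python. This is a simplified illustration, not the paper's implementation: the frame handling and the fixed threshold `T` are placeholders (the paper derives the threshold statistically from the short-term energies).

```python
def short_term_energy(y, N):
    """Short-term energy E(n) over a centered frame of odd length N (illustrative)."""
    half = N // 2
    return [sum(y[m] ** 2 for m in range(n - half, n + half + 1))
            for n in range(half, len(y) - half)]

def vocal_segment(E, T):
    """Return (left, right) indices of the span where energy stays above threshold T."""
    left = right = None
    for n, e in enumerate(E):
        if e >= T and left is None:
            left = n                    # first crossing above T: left border
        if e < T and left is not None:
            right = n - 1               # first drop below T: right border
            break
    if left is not None and right is None:
        right = len(E) - 1              # segment runs to the end of the signal
    return left, right
```

For example, a short burst of unit samples surrounded by silence yields a single high-energy span whose borders the threshold rule recovers.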
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Definition of formants of the central frame of the vocal segment</head><p>The paper proposes a method for determining the formants of the central frame of the vocal segment based on linear prediction coding (LPC), which includes the following steps:</p><p>1. Balance the spectrum, which falls off steeply in the high-frequency region, by pre-emphasis filtering</p><formula xml:id="formula_4">s̃(m) = s(m) − μ·s(m − 1), m = N_c − N/2, …, N_c + N/2,</formula><p>where μ is the filtering parameter, 0 &lt; μ &lt; 1.</p><p>2. Apply the Hamming window and calculate the autocorrelation function R(k)</p><formula xml:id="formula_5">s̄(m) = s̃(m)·w(m), w(m) = 0.54 − 0.46·cos(2πm/N), R(k) = Σ_{m=N_c−N/2}^{N_c+N/2−k} s̄(m)·s̄(m + k), k = 0, …, p,</formula><p>where w(m) is the Hamming window and p is the linear prediction order,</p><formula xml:id="formula_6">ceil(f_d/1000) ≤ p ≤ ceil(f_d/1000) + 5,</formula><p>where ceil(f) rounds f up to the nearest integer.</p><p>3. Calculate the LPC coefficients a_j in accordance with the Durbin procedure <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>:</p><formula xml:id="formula_7">3.1. E^(0) = R(0);
3.2. k_i = [R(i) − Σ_{j=1}^{i−1} α_j^(i−1)·R(i − j)] / E^(i−1);
3.3. α_i^(i) = k_i;
3.4. α_j^(i) = α_j^(i−1) − k_i·α_{i−j}^(i−1), 1 ≤ j ≤ i − 1;
3.5. E^(i) = (1 − k_i²)·E^(i−1);
3.6. i = i + 1;
3.7. if i ≤ p, then go to step 3.2;
3.8. a_j = α_j^(p), 1 ≤ j ≤ p.</formula><p>4. Calculate the gain coefficient G</p><formula xml:id="formula_11">G² = E^(p) = R(0) − Σ_{k=1}^{p} a_k·R(k).</formula><p>5. Calculate the logarithmic energy spectrum using the gain and LPC coefficients</p><formula xml:id="formula_12">W(k) = G² / |1 − Σ_{j=1}^{p} a_j·exp(−i·2πjk/N)|², k = 0, …, N − 1.</formula><p>6. Calculate the frequencies and amplitudes of the formants in the logarithmic energy spectrum of the central frame:</p><p>6.1. Set the frequency number k = 0 and the number of formants i = 0;</p><p>6.2. If 10·lg W(k − 1) &lt; 10·lg W(k) and 10·lg W(k) &gt; 10·lg W(k + 1), then fix the formant frequency, i.e. F_{i+1} = k, and the formant amplitude, i.e. A_{i+1} = 10·lg W(k), and increase the number of local extrema, i.e. i = i + 1;</p><p>6.3. If i &lt; 3, then go to the next frequency, i.e. k = k + 1, and go to step 6.2.</p></div>
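The Durbin (Levinson-Durbin) recursion of step 3 can be sketched in pure Python; variable names are illustrative, and the function simply maps the autocorrelations R(0..p) to the LPC coefficients a_1..a_p and the final prediction error.

```python
def levinson_durbin(R, p):
    """Levinson-Durbin recursion: compute LPC coefficients a[1..p] from the
    autocorrelation sequence R[0..p]. Returns (a, prediction_error)."""
    a = [0.0] * (p + 1)          # a[0] is unused; a[j] holds alpha_j
    E = R[0]                     # initial prediction error E^(0) = R(0)
    for i in range(1, p + 1):
        # reflection coefficient k_i
        acc = R[i] - sum(a[j] * R[i - j] for j in range(1, i))
        k = acc / E
        a_new = a[:]
        a_new[i] = k
        for j in range(1, i):    # update previous coefficients
            a_new[j] = a[j] - k * a[i - j]
        a = a_new
        E *= (1.0 - k * k)       # E^(i) = (1 - k_i^2) * E^(i-1)
    return a[1:], E
```

For an AR(1)-like autocorrelation R(k) = 0.5^k, the recursion recovers a single nonzero coefficient a_1 = 0.5, as expected.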
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Choice of vocal speech sound features based on formants of the central frame of the vocal segment</head><p>The following vocal speech sound features have been chosen:</p><p>─ the frequency of the first formant, x1 = F1; ─ the frequency of the second formant, x2 = F2; ─ the frequency of the third formant, x3 = F3; ─ the amplitude of the first anti-formant, x4 = A1; ─ the amplitude of the second anti-formant, x5 = A2; ─ the amplitude of the third anti-formant, x6 = A3.</p><p>The total number of features is denoted as Q = 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Creation of a model of digital content processing system</head><p>The proposed digital content processing system that performs biometric identification of a person by voice is the artificial neuro-fuzzy network, a graph model of which is shown in Fig. <ref type="figure">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 1. A graph model of digital content processing system.</head><p>The input (zero) layer contains N^(0) = Q neurons (corresponding to the number of features). The first hidden layer implements the fuzzification and contains N^(1) = M·Q neurons (corresponding to the number of values of the linguistic variables). The second hidden layer implements the aggregation of subconditions and contains N^(2) = M neurons (corresponding to the number of rules M). The third hidden layer implements the activation of conclusions and contains N^(3) = M neurons. The fourth hidden layer implements the aggregation of conclusions. The output layer implements the defuzzification and contains N^(5) = 1 neuron. All weighting coefficients are equal to 1. The creation of the mathematical model of digital content processing system involves the following steps: ─ formation of a fuzzy rule base; ─ fuzzification; ─ aggregation of subconditions; ─ activation of conclusions; ─ aggregation of conclusions; ─ defuzzification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Formation of a fuzzy rule base</head><p>Imagine the j-th fuzzy rule in the form</p><formula xml:id="formula_15">R_j: IF x_1 is α_1^j AND … AND x_Q is α_Q^j THEN y is β_j,</formula><p>where x_i is the name of the input linguistic variable, i = 1, …, Q; y is the name of the output linguistic variable; α_i^j is the fuzzy variable (the value of the linguistic variable x_i), j = 1, …, M, i = 1, …, Q; β_j is the fuzzy variable (the value of the linguistic variable y), j = 1, …, M.</p><p>The fuzzy set A_i^j is the range of values of the fuzzy variable α_i^j, and the fuzzy set B_j is the range of values of the fuzzy variable β_j.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Fuzzification</head><p>Let's determine the degree of truth of the i-th subcondition, i.e. establish the correspondence between the input variable x_i of the j-th rule and the value of the membership function μ_{A_i^j}(x_i). Since a number of methods related to person identification by voice use the Gauss function, we choose this function as μ_{A_i^j}(x_i), i.e.</p><formula xml:id="formula_18">μ_{A_i^j}(x_i) = exp(−(1/2)·((x_i − m_i^j)/σ_i^j)²),</formula><p>where m_i^j is the mathematical expectation and σ_i^j is the standard deviation.</p></div>
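The Gaussian membership function above is a one-liner; the following minimal sketch (names are illustrative) makes the fuzzification step concrete.

```python
import math

def gauss_membership(x, m, sigma):
    """Degree of truth of a subcondition: Gaussian membership mu(x)
    with expectation m and standard deviation sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)
```

The membership peaks at 1 when x equals the expectation m and decays symmetrically on either side.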
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Aggregation of subconditions</head><p>The membership function of the condition for the j-th rule is defined as</p><formula xml:id="formula_19">μ_{A^j}(x) = μ_{A_1^j}(x_1) · … · μ_{A_Q^j}(x_Q), j = 1, …, M.</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Activation of conclusions</head><p>The membership function of the conclusion for the j-th rule is defined as</p><formula xml:id="formula_20">μ_{C_j}(y) = μ_{A^j}(x) · μ_{B_j}(y), j = 1, …, M,</formula><p>where</p><formula>μ_{B_j}(y) = 0, y ≤ j − 0.5; (y − (j − 0.5))/0.5, j − 0.5 &lt; y ≤ j; ((j + 0.5) − y)/0.5, j &lt; y ≤ j + 0.5; 0, y &gt; j + 0.5</formula><p>is a triangular function centered at j.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Aggregation of conclusions</head><p>The membership function of the final conclusion is defined as</p><formula xml:id="formula_21">μ_C(y) = max(μ_{C_1}(y), …, μ_{C_M}(y)).</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.6">Defuzzification</head><p>To obtain the class number, the membership function maximum method is used:</p><formula xml:id="formula_22">y = arg max_j μ_C(z_j),</formula><p>where z_j is the center of the fuzzy set C_j.</p><p>Thus, the mathematical model of digital content processing system (Fig. <ref type="figure">1</ref>) can be represented as</p><formula xml:id="formula_23">y = arg max_{j=1,…,M} [ μ_{B_j}(z_j) · Π_{i=1}^{Q} μ_{A_i^j}(x_i) ].</formula><p>The determination of the parameters of this system is carried out on the basis of the genetic algorithm.</p></div>
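Putting the inference chain together (Gaussian fuzzification, product aggregation of subconditions, maximum-based defuzzification), an end-to-end classification sketch might look as follows. This is a simplified illustration under the assumption that each rule j concludes its own class j with full strength at its center; the rule parameters and names are placeholders, not the paper's trained values.

```python
import math

def classify(x, rules):
    """x: feature vector; rules: one list of (m, sigma) pairs per rule,
    one pair per feature. Returns the index of the rule (class) whose
    aggregated condition membership is largest."""
    best_j, best_score = None, -1.0
    for j, params in enumerate(rules):
        score = 1.0
        for xi, (m, s) in zip(x, params):
            # product aggregation of Gaussian subcondition memberships
            score *= math.exp(-0.5 * ((xi - m) / s) ** 2)
        if score > best_score:
            best_j, best_score = j, score
    return best_j
```

With two rules centered at different points of the feature space, inputs near a rule's centers are assigned to that rule's class.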
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Choice of the structure of the method for determining parameter values of the mathematical model of digital content processing system</head><p>The choice of the structure of the genetic algorithm, which allows determining the parameter values of the mathematical model of digital content processing system, involves the following steps:</p><p>─ identification of individuals of the initial population; ─ definition of fitness function; ─ choice of reproduction (selection) operator; ─ choice of crossing-over operator; ─ choice of mutation operator; ─ choice of reduction operator; ─ definition of a stop condition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Identification of individuals of the initial population</head><p>Real-valued genes have been selected for the following reasons:</p><p>─ the ability to search in large spaces, which is difficult in the case of binary genes, where an increase in the search space reduces the accuracy of the solution at a constant chromosome length; ─ the ability to tune solutions locally; ─ the lack of encoding/decoding operations, which are necessary for binary genes, increases the speed of the algorithm; ─ proximity to the formulation of most applied problems (each real-valued gene is responsible for one variable or parameter, which is impossible in the case of binary genes).</p><p>An ordered vector of parameters (mathematical expectations and standard deviations of the membership functions) acts as the chromosome that represents the i-th individual of the population H = {h_i}.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Definition of fitness function</head><p>In the paper the following fitness function, which corresponds to the probability of correct identification of a person by voice, is proposed:</p><formula xml:id="formula_25">F = (1/P) · Σ_{p=1}^{P} I(y_p, d_p), I(y, d) = 1 if y = d, and 0 otherwise,</formula><p>where d_p is the response received from the object (person), y_p is the response obtained by the model, and P is the number of test implementations.</p></div>
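A fitness of this indicator-averaging form reduces to accuracy over the test set; a minimal sketch (function name is illustrative):

```python
def fitness(y_model, y_true):
    """Probability of correct identification: the fraction of the P test
    implementations where the model's response matches the person's."""
    assert len(y_model) == len(y_true) and y_model
    return sum(ym == yt for ym, yt in zip(y_model, y_true)) / len(y_true)
```
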
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Choice of reproduction (selection) operator</head><p>The following effective combination is used as the reproduction operator to select parameter vectors for crossing-over and mutation:</p><formula xml:id="formula_26">P(h_i) = exp(−1/g(t)) · 1/|H| + (1 − exp(−1/g(t))) · 2(|H| − i + 1)/(|H|(|H| + 1)),</formula><p>where the individuals h_i are ordered by decreasing fitness. Thus, in the early stages of the genetic algorithm, uniform selection is used to ensure that the entire search space is explored (random selection of chromosomes), and in the final stages, linearly ordered selection is used to make the search directed (the current best chromosomes are preserved). This combination does not require scaling and can be used to minimize the fitness function.</p></div>
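The blend of uniform and rank-based selection described above can be sketched as follows. This is an illustrative interpretation, not the paper's exact operator: the annealing weight `exp(-1/g(t))` decides, per draw, whether to sample uniformly or by linear ranking, and the ranking probabilities here follow the standard 2(n − rank)/(n(n + 1)) form.

```python
import math
import random

def select_index(fitness, g_t, rng=random):
    """Pick one individual's index. With weight exp(-1/g(t)) use uniform
    selection (early stages, g(t) large); otherwise use linear-rank
    selection that favours fitter individuals (late stages, g(t) small)."""
    n = len(fitness)
    w_uniform = math.exp(-1.0 / g_t) if g_t > 0 else 0.0
    if rng.random() < w_uniform:
        return rng.randrange(n)                      # uniform selection
    order = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    total = n * (n + 1) // 2                         # sum of rank weights
    pick = rng.uniform(0, total)
    acc = 0.0
    for rank, i in enumerate(order):                 # rank 0 = best
        acc += n - rank                              # weight n, n-1, ..., 1
        if pick <= acc:
            return i
    return order[-1]
```

With g(t) near zero the operator degenerates to pure rank selection, so fitter individuals are chosen about twice as often as the weakest in a two-individual population.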
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Choice of crossing-over (crossover, recombination) operator</head><p>To combine the two variants of the parameter vector selected by the reproduction operator, uniform crossing-over is used as the crossing-over operator. Parents are selected through the following effective combination: in the early stages of the genetic algorithm, outbreeding is used to provide an investigation of the entire search space, and in the final stages, inbreeding is used to make the search directed. This combination does not require scaling and can be used to minimize the fitness function.</p><p>After the selection of parents, crossing-over is carried out and two descendants are produced.</p><p>For a global search for the optimal vector of parameters, it is necessary to increase the variety of variants.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5">Choice of mutation operator</head><p>To ensure the variety of variants of the parameter vector after crossing-over, a non-uniform mutation is used. The mutation step decreases as the iteration number grows and, to simulate annealing, so does the probability of mutation, where P_0 denotes its initial value. Thus, in the early stages of the genetic algorithm, a large-step mutation occurs with high probability, which provides an investigation of the entire search space, and in the final stages, the probability of mutation and its step tend to zero, which makes the search directed.</p></div>
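A Michalewicz-style non-uniform mutation with a step that shrinks toward zero as the iteration counter approaches its maximum can be sketched as follows; the step schedule `r · (1 − t/T)^b` and parameter names are an illustrative reading of the operator, not the paper's exact formula.

```python
import random

def nonuniform_mutate(h, lo, hi, t, T, b=2.0, rng=random):
    """Non-uniform mutation of one real-valued gene h in [lo, hi]:
    the mutation step shrinks as iteration t approaches the maximum T."""
    r = rng.random()
    step = r * (1.0 - t / T) ** b    # large steps early, vanishing steps late
    if rng.random() >= 0.5:
        return h + (hi - h) * step   # move toward the upper bound
    return h - (h - lo) * step       # move toward the lower bound
```

By construction the mutated gene stays inside [lo, hi], and at t = T the step is exactly zero, so the gene is left unchanged.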
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.6">Choice of reduction operator</head><p>The reduction operator creates a new population based on the previous population and the parameter vectors obtained by crossing-over and mutation. As the reduction operator, a (μ + λ) scheme is applied, which does not require scaling and can be used to minimize the fitness function.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.7">Definition of a stop condition</head><p>The following stop condition is proposed in the work:</p><formula xml:id="formula_29">max_i F(h_i) ≥ 1 − ε ∨ t ≥ T.</formula><p>The values of ε and T are chosen experimentally.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>6.</head><label>6</label><figDesc>Determine the left and right borders of the vocal segment: 6.1. Set the sample number n = 1.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>The frequency of the first formant, x1 = F1; the frequency of the second formant, x2 = F2; the frequency of the third formant, x3 = F3; the amplitude of the first anti-formant, x4 = A1; the amplitude of the second anti-formant, x5 = A2; the amplitude of the third anti-formant, x6 = A3. The total number of features is denoted as Q = 6.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>response received from the object (person), p y is the response obtained by the model, P is the number of test implementations.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="1,0.00,190.95,595.32,460.02" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>The mutation step: h′_ij = h_ij + (Max_j − h_ij)·r·(1 − t/T)^b if r ≥ 0.5, and h′_ij = h_ij − (h_ij − Min_j)·r·(1 − t/T)^b if r &lt; 0.5, where Max_j and Min_j are the maximum and minimum values of the j-th gene; t is the iteration number; T is the maximum number of iterations; r is a random number, r ∈ [0, 1]; b is the parameter controlling the speed of step decrease, b &gt; 0. To simulate annealing, the probability of mutation is defined as P_m = P_0·exp(−1/g(t)), g(t) = γ·g(t − 1), 0 &lt; γ &lt; 1, g(0) = T, T &gt; 0, where P_0 is the initial probability of mutation.</figDesc></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Numerical research</head><p>Table <ref type="table">1</ref> presents the probabilities of identification of a person by voice obtained on the TIMIT corpus with an artificial neural network of the multilayer perceptron type and with the proposed method. The artificial neural network had two hidden layers, each consisting of six neurons, like the input layer.</p><p>According to Table <ref type="table">1</ref>, the proposed method gives the best results.</p><p>Table <ref type="table">1</ref>. The probability of biometric identification of a person by voice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Method / Identification probability</head><p>Artificial neural network: 0.8. Proposed method: 0.98.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusions</head><p>1. To solve the problem of increasing the efficiency of a digital content processing system for biometric identification of a person by voice, the corresponding speaker recognition methods have been investigated. These studies have shown that today the use of artificial neural networks in combination with a fuzzy inference system and a genetic algorithm is the most effective approach.</p><p>2. The proposed method of digital content processing for biometric identification of a person by voice automates the generation of digital content features; represents knowledge in the form of rules that are easily accessible to human understanding and simplifies the determination of the model structure, due to the fuzzy inference system; reduces the probability of falling into a local extremum and provides an acceptable speed of determining the parameter values of the model, due to the chosen structure of the genetic algorithm; and allows parallel processing of information, due to the artificial neural network.</p><p>3. As a result of a numerical study, it has been found that the proposed method of digital content processing provides a 0.98 probability of biometric identification of a person by voice, which exceeds the probability obtained by an artificial neural network of the multilayer perceptron type.</p><p>4. The proposed method of digital content processing for biometric identification of a person by voice can be used in various intelligent systems for digital content processing.</p></div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Guide to biometrics</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M</forename><surname>Bolle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Connell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pankanti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">K</forename><surname>Ratha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W</forename><surname>Senior</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2004">2004</date>
			<publisher>Springer</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Handbook of biometrics</title>
		<editor>Jain, A.K., Flynn, P., Ross, A.</editor>
		<imprint>
			<date type="published" when="2008">2008</date>
			<publisher>Springer</publisher>
			<pubPlace>New York, NY</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Dunstone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Yager</surname></persName>
		</author>
		<title level="m">Biometric system and data analysis: design, evaluation, and data mining</title>
				<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Applications of Speaker Recognition</title>
		<author>
			<persName><forename type="first">N</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Shree</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.proeng.2012.06.363</idno>
	</analytic>
	<monogr>
		<title level="j">Procedia Engineering</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="3122" to="3126" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Speaker authentication</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Berlin Heidelberg; Heidelberg</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Automatic speech and speaker recognition: large margin and kernel methods</title>
		<author>
			<persName><forename type="first">J</forename><surname>Keshet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>John Wiley &amp; Sons</publisher>
			<pubPlace>Chichester</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Self-learning speaker identification: a system for enhanced speech recognition</title>
		<author>
			<persName><forename type="first">T</forename><surname>Herbig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Minker</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>Springer</publisher>
			<pubPlace>Berlin</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Speaker recognition: a tutorial</title>
		<author>
			<persName><forename type="first">J</forename><surname>Campbell</surname></persName>
		</author>
		<idno type="DOI">10.1109/5.628714</idno>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the IEEE</title>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="volume">85</biblScope>
			<biblScope unit="page" from="1437" to="1462" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An Overview of Speaker Identification: Accuracy and Robustness Issues</title>
		<author>
			<persName><forename type="first">R</forename><surname>Togneri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pullella</surname></persName>
		</author>
		<idno type="DOI">10.1109/MCAS.2011.941079</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Circuits and Systems Magazine</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="23" to="61" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Fundamentals of speaker recognition</title>
		<author>
			<persName><forename type="first">H</forename><surname>Beigi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>Springer</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">An overview of automatic speaker recognition technology</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Reynolds</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICASSP.2002.5745552</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE International Conference on Acoustics, Speech and Signal Processing</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="4072" to="4075" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">An overview of text-independent speaker recognition: From features to supervectors</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kinnunen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.specom.2009.08.009</idno>
	</analytic>
	<monogr>
		<title level="j">Speech Communication</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="page" from="12" to="40" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Robust text-independent speaker identification using Gaussian mixture speaker models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Reynolds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rose</surname></persName>
		</author>
		<idno type="DOI">10.1109/89.365379</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Speech and Audio Processing</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="72" to="83" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Speaker Recognition based on a Novel Hybrid Algorithm</title>
		<author>
			<persName><forename type="first">F.-Z</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.proeng.2013.08.007</idno>
	</analytic>
	<monogr>
		<title level="j">Procedia Engineering</title>
		<imprint>
			<biblScope unit="volume">61</biblScope>
			<biblScope unit="page" from="220" to="226" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Speech recognition of deaf and hard of hearing people using hybrid neural network</title>
		<author>
			<persName><forename type="first">C</forename><surname>Jeyalakshmi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Krishnamurthi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Revathi</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICMEE.2010.5558589</idno>
	</analytic>
	<monogr>
		<title level="m">2nd International Conference on Mechanical and Electronics Engineering</title>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Comparison of Text Independent Speaker Identification Systems using GMM and i-Vector Methods</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nayana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Mathew</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Thomas</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.procs.2017.09.075</idno>
	</analytic>
	<monogr>
		<title level="j">Procedia Computer Science</title>
		<imprint>
			<biblScope unit="volume">115</biblScope>
			<biblScope unit="page" from="47" to="54" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Speech to text converter using Gaussian mixture model (GMM)</title>
		<author>
			<persName><forename type="first">V</forename><surname>Chauhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sh</forename><surname>Dwivedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Karale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Potdar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Research Journal of Engineering and Technology (IRJET)</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="160" to="164" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Automatic speaker recognition using Gaussian mixture speaker models</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Reynolds</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Speech and Audio Processing</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="1738" to="1752" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Method for parametric identification of Gaussian mixture model based on clonal selection algorithm</title>
		<author>
			<persName><forename type="first">E</forename><surname>Fedorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lukashenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Utkina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Rudakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lukashenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2353</biblScope>
			<biblScope unit="page" from="41" to="55" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Combination of PNN network and DTW method for identification of reserved words, used in aviation during radio negotiation</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">J</forename><surname>Larin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">E</forename><surname>Fedorov</surname></persName>
		</author>
		<idno type="DOI">10.3103/S0735272714080044</idno>
	</analytic>
	<monogr>
		<title level="j">Radioelectronics and Communications Systems</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="page" from="362" to="368" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Fundamentals of speech recognition</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">R</forename><surname>Rabiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B.-H</forename><surname>Juang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2005">2005</date>
			<publisher>Pearson Education</publisher>
			<pubPlace>Delhi</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Linear prediction of speech</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Markel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">H</forename><surname>Gray</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1976">1976</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Berlin</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
