
The syllable intelligibility in the system of information transmission by speech signals depending on the intensity of acoustic noise

Yu A Kropotov

A A Belov

A A Kolpakov

A Yu Proskuryakov

Murom Institute (branch) «Vladimir State University named after Alexander and Nicholay Stoletovs», Orlovskaya street, 23, Murom, Vladimir Region, Russia, 602264

2019

277 282

The paper investigates the effect of the signal-to-noise ratio on syllable intelligibility under intense external acoustic interference during the exchange of voice messages in telecommunication public address systems. It considers how syllable intelligibility depends on the ratio of the signal to external acoustic noise, and examines the integral articulation index, the dependence of the formant perception coefficient on the relative formant intensity level, and the dependence of the formant parameter on the geometric mean frequency of the i-th band of the speech signal spectrum. From the dependence of the integral articulation index on the signal-to-noise ratio, a function of syllable intelligibility versus signal-to-noise ratio was obtained. With this function it is possible to determine the maximum value of the output signal-to-noise ratio in the audio exchange telecommunication system for a given syllable intelligibility. In addition, the signal-to-noise ratio required in the audio exchange telecommunication system to obtain a syllable intelligibility of at least 93%, which ensures full perception of the transmitted speech information, was determined experimentally.

1. Introduction

As is well known, the main criterion of the efficiency of a system for the telecommunication exchange of speech information is the syllable intelligibility S, %, or the rating of the speech signal on the MOS (Mean Opinion Score) scale [1].

Telecommunication audio exchange systems, in particular loudspeaker systems, are considered effective if the transmitted speech information is perceived by the listener completely and without difficulty. In this case the syllable intelligibility is at least 93% [1, 2, 4], or the MOS score is at least 3.9 points on a five-point scale [5, 6].

2. Formulation of the problem

The dependence of syllable intelligibility in systems for the telecommunication exchange of speech information on various factors has been studied in a number of works [1, 3]. However, the known sources [9, 10] give insufficient information about the influence of the signal-to-noise ratio on syllable intelligibility at the receiving side for operational-command telecommunication systems. This article therefore considers the problem of determining the influence of the signal-to-noise ratio on syllable intelligibility in telecommunication audio exchange systems.

The known results of assessing syllable intelligibility by the instrumental-calculation method are shown in Fig. 1 [1, 3].

3. Instrumental-calculation method for estimating the integral articulation index and syllable intelligibility

The value of the integral articulation index $R$ is determined from the spectral articulation indices $R_i$ by the expression

$$R = \sum_{i=1}^{N} R_i. \qquad (1)$$

The spectral articulation index is calculated by the expression

$$R_i = p_i \cdot k_i, \qquad (2)$$

where $p_i$ is the formant perception coefficient and $k_i$ is the weighting coefficient of the presence of speech formants in the i-th band.

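The formant perception coefficient $p_i$ is calculated using the expression [3]

$$p_i = \begin{cases} \dfrac{0.78 + 5.46 \exp\left[-4.3 \cdot 10^{-3} \left(27.3 - |Q_i|\right)^2\right]}{1 + 10^{0.1 |Q_i|}}, & \text{if } Q_i \le 0; \\[2ex] 1 - \dfrac{0.78 + 5.46 \exp\left[-4.3 \cdot 10^{-3} \left(27.3 - |Q_i|\right)^2\right]}{1 + 10^{0.1 |Q_i|}}, & \text{if } Q_i > 0, \end{cases} \qquad (3)$$

where $Q_i = q_i - \Delta A_i$ is the relative intensity level of the formant.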
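Expression (3) can be evaluated numerically. The following is a minimal Python sketch, not the authors' code; the function name `perception_coefficient` is hypothetical, and the sketch assumes that the level enters the exponent and the denominator through its absolute value $|Q_i|$, which makes both branches meet near 0.5 at $Q_i = 0$.

```python
import math


def perception_coefficient(Q):
    """Formant perception coefficient p for a relative formant level Q (dB).

    Sketch of expression (3). ASSUMPTION: the level is used as |Q| inside
    the exponent and the denominator.
    """
    a = abs(Q)
    base = (0.78 + 5.46 * math.exp(-4.3e-3 * (27.3 - a) ** 2)) / (1.0 + 10.0 ** (0.1 * a))
    # Non-positive levels use the fraction directly, positive levels its complement.
    return base if Q <= 0 else 1.0 - base


if __name__ == "__main__":
    for level in (-20.0, -10.0, 0.0, 10.0, 20.0):
        print(f"Q = {level:6.1f} dB  p = {perception_coefficient(level):.3f}")
```

With this form the coefficient is symmetric about $Q_i = 0$, so that the values at $Q$ and $-Q$ sum to one.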

Alternatively, the value of the formant perception coefficient $p_i$ can be determined from the graph in Figure 2. The formant parameter $\Delta A_i$ is determined from the graph in Figure 3 or by the expression

$$\Delta A(f) = \begin{cases} 200 / f^{0.43} - 0.37, & \text{if } f \le 1000\ \text{Hz}; \\ 1.37 + 1000 / f^{0.69}, & \text{if } f > 1000\ \text{Hz}, \end{cases} \qquad (4)$$

where $f = f_{\text{m},i} = \sqrt{f_{\text{u},i} \cdot f_{\text{l},i}}$ is the geometric mean frequency, $f_{\text{l},i}$ is the lower frequency and $f_{\text{u},i}$ is the upper frequency of the i-th band of the speech spectrum.

For each i-th (i = 1, 2, ..., N) frequency band, the formant parameter $\Delta A_i$, which characterizes the energy redundancy of the discrete components of the speech signal, is determined at the geometric mean frequency $f_{\text{m},i}$.
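As an illustration of this step, the sketch below computes the geometric mean frequency of a band and the formant parameter at that frequency. The function names are hypothetical, and the band limits in the usage line (355 Hz and 710 Hz) are an assumed example, not values taken from Table 1.

```python
import math


def geometric_mean_frequency(f_low, f_high):
    """Geometric mean of the lower and upper band limits, in Hz."""
    return math.sqrt(f_low * f_high)


def formant_parameter(f):
    """Formant parameter Delta A (dB) at frequency f (Hz), piecewise in f."""
    if f <= 1000.0:
        return 200.0 / f ** 0.43 - 0.37
    return 1.37 + 1000.0 / f ** 0.69


if __name__ == "__main__":
    # ASSUMPTION: illustrative band limits, not the paper's Table 1.
    f_mid = geometric_mean_frequency(355.0, 710.0)
    print(f"f_mid = {f_mid:.1f} Hz, Delta A = {formant_parameter(f_mid):.2f} dB")
```

The two branches of the piecewise expression agree closely at 1000 Hz, which is a useful sanity check on the constants.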

4. Results of experiments

Let us take the number of octave bands N = 5. The accepted frequency limits of the octave bands, the calculated geometric mean frequencies $f_{\text{m},i}$ and the values of the formant parameters $\Delta A_i$ are given in Table 1.

Using the expression $Q_i = q_i - \Delta A_i$, the formant intensity levels $Q_i$ were determined as a function of the signal-to-noise ratio $q_i$. The calculated values of $Q_i$ are summarized in Table 2.

Using expression (3) or the graph in Figure 2, the formant perception coefficient $p_i$ is determined from $Q_i$ for each i-th band at different values of the signal-to-noise ratio, dB. The calculated values of $p_i$ for different $q_i$ are summarized in Table 3.

[Table: values of $Q_i = q_i - \Delta A_i$ at $q_i$ = 0, 3, 6, 10, 20 and 30 dB]

2,57 ⋅10−8 ⋅ f 2,4 , если 100 < f ≤ 400 Гц; k( f ) =  1 −1,074 ⋅ exp(−10−4 ⋅ f 1,18 ), если 400 < f ≤ 10000 Гц; (5) or according to the chart in Figure 4.

The calculated weighting coefficients of the presence of speech formants in the i-th band are presented in Table 4.

The spectral articulation index $R_i$ is calculated by formula (2). The values of $R_i$ at different signal-to-noise ratios are summarized in Table 5.

[Table: values of $R_i = p_i \cdot k_i$ at $q_i$ = 0, 3, 6, 10, 20 and 30 dB]

The values of the spectral articulation index $R_i$ summarized in Table 5 make it possible to calculate the integral articulation index as a function of the signal-to-noise ratio. From the integral articulation index, the values of syllable intelligibility as a function of the signal-to-noise ratio were found; they are summarized in Table 6.
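The chain of calculations from the signal-to-noise ratio to the integral articulation index can be sketched as below. This is an illustration under stated assumptions, not a reproduction of Tables 1 to 6: the octave band limits in `BANDS_HZ` are assumed, the same signal-to-noise ratio is applied in every band, the absolute value $|Q_i|$ is assumed in expression (3), and the band weight is assumed to be the difference of k(f) between the upper and lower band limits. The mapping from R to syllable intelligibility S is not given in the text, so the sketch stops at R.

```python
import math

# ASSUMPTION: illustrative octave band limits in Hz (not the paper's Table 1).
BANDS_HZ = [(180, 355), (355, 710), (710, 1400), (1400, 2800), (2800, 5600)]


def formant_parameter(f):
    """Delta A(f), expression (4)."""
    return 200.0 / f ** 0.43 - 0.37 if f <= 1000.0 else 1.37 + 1000.0 / f ** 0.69


def perception_coefficient(Q):
    """p(Q), expression (3). ASSUMPTION: |Q| in exponent and denominator."""
    a = abs(Q)
    base = (0.78 + 5.46 * math.exp(-4.3e-3 * (27.3 - a) ** 2)) / (1.0 + 10.0 ** (0.1 * a))
    return base if Q <= 0 else 1.0 - base


def formant_weight(f):
    """k(f), expression (5), for 100 < f <= 10000 Hz."""
    return 2.57e-8 * f ** 2.4 if f <= 400.0 else 1.0 - 1.074 * math.exp(-1e-4 * f ** 1.18)


def integral_articulation_index(q_db, bands=BANDS_HZ):
    """R = sum of p_i * k_i over the bands for one signal-to-noise ratio (dB)."""
    total = 0.0
    for f_low, f_high in bands:
        f_mid = math.sqrt(f_low * f_high)        # geometric mean frequency
        Q = q_db - formant_parameter(f_mid)      # Q_i = q_i - Delta A_i
        p = perception_coefficient(Q)
        # ASSUMPTION: band weight is the increment of k(f) across the band.
        k = formant_weight(f_high) - formant_weight(f_low)
        total += p * k                           # R_i = p_i * k_i
    return total


SNR_DB = (0, 3, 6, 10, 20, 30)
R_VALUES = [integral_articulation_index(q) for q in SNR_DB]

for q, R in zip(SNR_DB, R_VALUES):
    print(f"q = {q:2d} dB  R = {R:.3f}")
```

Under these assumptions R grows monotonically with the signal-to-noise ratio and stays below one, because the assumed bands cover only part of the range on which k(f) is defined.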

The graph of syllable intelligibility S as a function of the signal-to-noise ratio is shown in Figure 1.

[Figure: syllable intelligibility S, %, versus signal-to-noise ratio q, dB; curve 1: English speech, curve 2: Russian speech]

5. Conclusions

As can be seen from the graphs in Figure 5, a syllable intelligibility of S ≥ 93% in the voice messaging telecommunication system is ensured at a signal-to-noise ratio q ≥ 20 dB [7, 8]. Thus, the dependence of syllable intelligibility on the signal-to-noise ratio, which is important for the practice of telecommunication systems, has been obtained. It shows that, for an operational-command telecommunication system to transmit speech information effectively, that is, with a syllable intelligibility of S ≥ 93%, a signal-to-noise ratio of q ≥ 20 dB must be provided at the receiving side.

6. References

[1] Sapozhkov M A 1962 Speech signal in cybernetics and communications (Moscow: Svyazizdat) p 452

[2] GOST 50840-95 Speech transmission via communication channels. Methods to assess quality, legibility and recognizability

[3] Zheleznyak V K, Makarov Y K and Khoreev A A 2000 Some methodical approaches to evaluation of efficiency of speech information protection Special technique 4 39-45

[4] Cohen, Benesty and Gannot S 2010 Speech processing in modern communication (Berlin, Heidelberg: Springer) p 342

[5] Hansler and Schmidt G 2006 Topics in acoustic echo and noise control: Selected methods for the cancelation of acoustic echoes, the reduction of background noise, and speech processing (Berlin, Heidelberg: Springer) p 642

[6] Kahrs and Brandenburg K 2002 Applications of digital signal processing to audio and acoustics (New York: Kluwer Academic Publisher) p 572

[7] Kropotov Y A and Belov A A 2016 Application method of barrier functions in the problem of estimating the probability density of the parameterized approximations 13th International Scientific-Technical Conference on Actual Problems of Electronic Instrument Engineering 69-72

[8] Kolpakov A A and Kropotov Y A 2017 Advanced mixing audio streams for heterogeneous computer systems in telecommunications CEUR Workshop Proceedings 1902 32-36

[9] Ryabenkyi V S 2012 Mathematical model of the external noise suppression devices in the subarea of space Mathematical modeling 24(8) 3-31

[10] McAulay R and Malpass M 1980 Speech enhancement using a soft-decision noise suppression filter IEEE Trans. on Acoustics, Speech, and Signal Processing 28(2) 137-145

[11] Kropotov Y A, Belov A A and Proskuryakov A Y 2018 Method for forecasting changes in time series parameters in digital information management systems Computer Optics 42(6) 1093-1100 DOI: 10.18287/2412-6179-2018-42-6-1093-1100