=Paper= {{Paper |id=Vol-3396/paper21 |storemode=property |title=Improvement of MVDR Beamformer’s Performance Based on Spectral Mask |pdfUrl=https://ceur-ws.org/Vol-3396/paper21.pdf |volume=Vol-3396 |authors=Quan Trong The |dblpUrl=https://dblp.org/rec/conf/colins/The23a }} ==Improvement of MVDR Beamformer’s Performance Based on Spectral Mask== https://ceur-ws.org/Vol-3396/paper21.pdf
Improvement of MVDR Beamformer’s Performance Based on
Spectral Mask
Quan Trong The
Digital Agriculture Cooperative, Cau Giay, Ha Noi, Viet Nam.


                 Abstract
                 In many speech applications, such as source tracking, hearing aids, augmented reality,
                 teleconferencing, and robot audition, acoustic beamforming is routinely used to enhance
                 the quality and intelligibility of signals captured by a microphone array in real-world
                 recording situations. The designed beamformer uses a priori information to form a spatial
                 beampattern that is steered towards the target sound source while suppressing surrounding
                 noise and interference. However, robust performance in adverse scenarios remains a
                 challenging task for several reasons. In this article, the author proposes a spectral mask
                 that is applied to the Minimum Variance Distortionless Response (MVDR) beamformer to
                 improve speech enhancement. The experiment shows that the suggested technique increases
                 the signal-to-noise ratio by 5.2 (dB) to 6.2 (dB) and reduces speech distortion to
                 3.2 (dB). The proposed approach improves perceptual quality metrics compared to the
                 conventional beamformer.

                 Keywords
                 microphone array, minimum variance distortionless response, speech enhancement,
                 signal-to-noise ratio (SNR), perceptual quality, robust performance

1. Introduction




 Figure 1: The complex surrounding environment around the target speaker

   The use of microphone arrays (MA) [1-9] and beamforming techniques has become common in most
speech applications, such as robot audition, teleconferencing, mobile phones, hearing aids,
surveillance devices, and virtual assistants. These devices must acquire the desired speech from a
target direction in the presence of third-party talkers, complex noise, and unwanted interference from
other directions. In the particular recording scenario where the talker is far from the microphones,
the received signal-to-noise ratio (SNR) is inadequate for further signal processing, and spatial
filtering cannot provide high speech quality or low distortion. Existing beamformers perform well in
laboratory conditions but may perform less well in real-world situations, which contain multiple
undetermined noise sources and interfering sound sources whose locations and characteristics are
non-stationary and vary with time.

COLINS-2023: 7th International Conference on Computational Linguistics and Intelligent Systems, April 20–21, 2023, Kharkiv, Ukraine
EMAIL: quantrongthe1984@gmail.com
ORCID: 0000-0002-2456-9598
© 2020 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

    Acoustic beamforming is conveniently implemented in the short-time Fourier transform (STFT)
domain. In each time-frequency cell, the complex value of the output signal is given by
$\mathbf{w}^H \mathbf{y}$, where $\mathbf{y}$ is the vector of microphone signals and $\mathbf{w}$
is the vector of optimum coefficients determined by the properties of the designed beamformer. When
choosing $\mathbf{w}$, a common criterion is to maximize the SNR of the beamformer output signal
while minimizing the total output noise power. To achieve this goal, it is necessary to know the
direction of arrival (DOA) of the signal of interest $\theta_s$, the steering vector of the target
speaker $\mathbf{d}_s(f, \theta_s)$, which describes the frequency response between the target sound
source and each element of the MA, and the geometry of the MA.




 Figure 2: Microphone array beamforming is used for separation of sound source

   The Minimum Variance Distortionless Response (MVDR) [10-17] beamformer is one of the most
important MA beamforming techniques. It uses a priori information about $\theta_s$ and
$\mathbf{d}_s(f, \theta_s)$, together with the covariance matrix of the observed MA signals, to find
the optimum solution $\mathbf{w}$. Consequently, the MVDR beamformer is probably the most widely used
beamforming technique in commercial products. Much research on robust MVDR beamforming has been
proposed and evaluated in real-world experimental conditions to avoid speech distortion. As a rule,
these algorithms work by extending the spatial region. Nevertheless, even when the DOA of the desired
talker is known perfectly or sound source localization is perfect, differences in microphone
sensitivities and directional responses degrade the performance of the MVDR beamformer. Therefore,
speech distortion remains a problem for MA processing.
   In this paper, the author considers the problem of preserving the original speech captured in a
noisy environment. Since surrounding noise greatly degrades speech enhancement, high-quality noise
reduction is essential to the MVDR beamformer's performance. While precise estimation of the steering
vector plays a major role in robust MA beamforming, in practical situations the a priori steering
vector is often based only on knowledge of the MA geometry and the assumption of plane-wave
propagation from the sound source. To overcome this limitation, a time-frequency mask-based research
direction has recently been proposed that improves the MVDR beamformer's performance. The central
idea is to suppress the speech component in the microphone array signal.




 Figure 3: The principle of extracting the desired talker using a microphone array

   In this paper, the author suggests a suitable spectral mask based on appropriately modified
coherence values of the surrounding noise and the desired signal. The experiments confirm the
effectiveness of the proposed method through a comparison of the conventional MVDR beamformer
(MVDR-conventional) and the suggested technique (SLM) in terms of SNR.
   This contribution is organized as follows. Section 2 describes the working principle of the
MVDR beamformer. Section 3 presents the idea of the SLM, and the experiments are evaluated in
Section 4. Finally, Section 5 gives the conclusion and the direction of the author's future research.

2. The signal model




 Figure 4: The scheme of MVDR beamformer’s performance
   In this section, the working principle of the MVDR beamformer, shown in Figure 4, is presented.
MVDR beamforming uses spatial information about the direction of arrival of the desired talker and
minimizes the total output noise power while preserving the target speech component. The MVDR
beamformer is therefore formulated as a constrained optimization problem: extract the desired
speaker while suppressing all background noise without speech distortion. The implementation of the
MVDR beamformer with a dual-microphone system (DMA2) [19-25, 28] can be written in the frequency
domain as follows.
   The two captured microphone signals are denoted by $X_1(f, k)$ and $X_2(f, k)$, with frequency
index $f$ and frame index $k$. Their short-time Fourier transform representation is:

      $$X_1(f, k) = S(f, k)\,e^{j\Phi_s} + V_1(f, k) \tag{1}$$
      $$X_2(f, k) = S(f, k)\,e^{-j\Phi_s} + V_2(f, k) \tag{2}$$

    where $S(f, k)$ is the desired speech component, $V_1(f, k)$ and $V_2(f, k)$ are additive noise,
$\theta_s$ is the direction of arrival of the talker of interest, $d$ is the distance between the two
microphones, $c$ (343 m/s) is the speed of sound in air, $\tau_0 = d/c$ is the sound delay, and
$\Phi_s = \pi f \tau_0 \cos(\theta_s)$.
    Without loss of generality, let $\mathbf{D}(f, \theta_s) = [e^{j\Phi_s} \;\; e^{-j\Phi_s}]^T$
denote the steering vector, $\mathbf{X}(f, k) = [X_1(f, k) \;\; X_2(f, k)]^T$, and
$\mathbf{V}(f, k) = [V_1(f, k) \;\; V_2(f, k)]^T$, where the symbol $T$ indicates the transpose
operator. Equations (1-2) can then be expressed as:

      $$\mathbf{X}(f, k) = S(f, k)\,\mathbf{D}(f, \theta_s) + \mathbf{V}(f, k) \tag{3}$$
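
   The signal model in (1)-(3) can be sketched for a single time-frequency bin as follows. This is a minimal illustration, not the author's implementation: the function names `steering_vector` and `observe`, and the example values (1 kHz, a source at 0 degrees, a noise-free bin), are assumptions chosen only to show the structure of the model.

```python
import numpy as np

C = 343.0  # speed of sound in air (m/s), as given in the text


def steering_vector(f_hz, theta_s_deg, d_m):
    """D(f, theta_s) = [exp(j*Phi_s), exp(-j*Phi_s)]^T, Phi_s = pi*f*tau0*cos(theta_s)."""
    tau0 = d_m / C  # sound delay between the two microphones
    phi_s = np.pi * f_hz * tau0 * np.cos(np.deg2rad(theta_s_deg))
    return np.array([np.exp(1j * phi_s), np.exp(-1j * phi_s)])


def observe(s, d_vec, v):
    """Equation (3) for one time-frequency bin: X = S * D + V."""
    return s * d_vec + v


# Hypothetical example values (not taken from the experiment): 1 kHz bin,
# 5 cm spacing, source at 0 degrees, no noise.
D = steering_vector(1000.0, 0.0, 0.05)
X = observe(1.0 + 0.5j, D, np.zeros(2, dtype=complex))
```

   Both entries of the steering vector have unit magnitude and are complex conjugates of each other, so the model encodes only a phase difference between the two microphones.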

   In most digital signal processing algorithms, the key requirement is to find an optimum solution
$\mathbf{W}(f, k)$ such that the output signal $\hat{S}(f, k)$ approximates the original $S(f, k)$:

      $$\hat{S}(f, k) = \mathbf{W}^H(f, k)\,\mathbf{X}(f, k) \tag{4}$$

   where the symbol $H$ denotes Hermitian conjugation.
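
   Equation (4) is applied independently in every time-frequency bin. A minimal sketch, assuming the weights and the STFT of the two channels are stored as arrays of shape (frequencies, frames, 2); the array layout and the name `apply_beamformer` are illustrative assumptions:

```python
import numpy as np


def apply_beamformer(W, X):
    """S_hat(f, k) = W^H(f, k) X(f, k) for arrays W, X of shape (F, K, 2)."""
    # Conjugate the weights and sum over the microphone axis.
    return np.sum(np.conj(W) * X, axis=-1)


# Hypothetical check: equal weights of 1/2 simply average the two channels.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((3, 4, 2)) + 1j * rng.standard_normal((3, 4, 2))
W_demo = np.full((3, 4, 2), 0.5 + 0.0j)
S_hat = apply_beamformer(W_demo, X_demo)
```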
   The constraint of preserving the desired target speech while minimizing the total output noise
power without speech distortion can be expressed mathematically as:

      $$\min_{\mathbf{W}(f, k)} \; \mathbf{W}^H(f, k)\,\mathbf{P}_{VV}(f, k)\,\mathbf{W}(f, k) \quad \text{s.t.} \quad \mathbf{W}^H(f, k)\,\mathbf{D}(f, \theta_s) = 1 \tag{5}$$

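   where $\mathbf{P}_{VV}(f, k) = E\{\mathbf{V}(f, k)\mathbf{V}^{*}(f, k)\}$ is the covariance matrix
of the noise signals. Solving (5) gives the coefficients of the MVDR beamformer:

      $$\mathbf{W}(f, k) = \frac{\mathbf{P}_{VV}^{-1}\,\mathbf{D}(f, \theta_s)}{\mathbf{D}^H(f, \theta_s)\,\mathbf{P}_{VV}^{-1}\,\mathbf{D}(f, \theta_s)} \tag{6}$$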
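
   The closed-form solution (6) can be checked numerically. The sketch below assumes a single frequency bin; the function name `mvdr_weights` and the example covariance matrix are hypothetical and serve only to verify the distortionless constraint from (5).

```python
import numpy as np


def mvdr_weights(P, D):
    """W = P^{-1} D / (D^H P^{-1} D) for one frequency bin."""
    Pinv_D = np.linalg.solve(P, D)          # P^{-1} D without forming the inverse
    return Pinv_D / (np.conj(D) @ Pinv_D)   # divide by the scalar D^H P^{-1} D


# Hypothetical 2x2 Hermitian positive-definite noise covariance and steering vector.
P_vv = np.array([[1.0, 0.3 + 0.1j],
                 [0.3 - 0.1j, 1.2]])
D = np.array([np.exp(0.4j), np.exp(-0.4j)])

W = mvdr_weights(P_vv, D)
constraint = np.vdot(W, D)  # W^H D (np.vdot conjugates its first argument)
```

   For spatially white noise (an identity covariance matrix), the same function returns D/2, which is the delay-and-sum beamformer for two microphones.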


   Unfortunately, in real-life recording situations, the noise statistics often cannot be calculated
precisely or estimated correctly, so the covariance matrix of the observed microphone array signals,
$\mathbf{P}_{XX}(f, k) = E\{\mathbf{X}(f, k)\mathbf{X}^{*}(f, k)\}$, is used instead. It is
determined by:

      $$\mathbf{P}_{XX}(f, k) = \begin{bmatrix} P_{X_1 X_1}(f, k) \times 1.001 & P_{X_1 X_2}(f, k) \\ P_{X_2 X_1}(f, k) & P_{X_2 X_2}(f, k) \times 1.001 \end{bmatrix} \tag{7}$$

   where the auto- and cross-spectra $P_{X_i X_j}(f, k)$, $i, j \in \{1, 2\}$, are computed
recursively as:

      $$P_{X_i X_j}(f, k) = (1 - \alpha)\,P_{X_i X_j}(f, k - 1) + \alpha\,X_i^{*}(f, k)\,X_j(f, k) \tag{8}$$

   where $\alpha$ is the smoothing parameter, in the range $[0, 1]$.
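
   The recursive estimate (8) and the diagonal scaling in (7) can be sketched as below for one frequency bin. The code follows the two equations literally as written; the function names, the zero initial state, and the example frame are assumptions for illustration.

```python
import numpy as np


def update_psd(P_prev, x, alpha=0.5):
    """Equation (8), taken literally: P_ij(k) = (1-alpha)*P_ij(k-1) + alpha*conj(X_i)*X_j."""
    return (1.0 - alpha) * P_prev + alpha * np.outer(np.conj(x), x)


def load_diagonal(P, factor=1.001):
    """Equation (7): scale the auto-spectra on the main diagonal by 1.001."""
    P_loaded = P.copy()
    idx = np.arange(P.shape[0])
    P_loaded[idx, idx] *= factor
    return P_loaded


# Hypothetical first frame, starting from a zero estimate.
x = np.array([1.0 + 1.0j, 2.0 + 0.0j])
P1 = update_psd(np.zeros((2, 2), dtype=complex), x)
P1_loaded = load_diagonal(P1)
det_plain = np.linalg.det(P1).real
det_loaded = np.linalg.det(P1_loaded).real
```

   A matrix built from a single frame has rank one and cannot be inverted; scaling the diagonal by 1.001 makes the determinant strictly positive, so the inverse required by the MVDR solution exists.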
   Finally, the resulting solution of the conventional MVDR beamformer is:

      $$\mathbf{W}(f, k) = \frac{\mathbf{P}_{XX}^{-1}\,\mathbf{D}(f, \theta_s)}{\mathbf{D}^H(f, \theta_s)\,\mathbf{P}_{XX}^{-1}\,\mathbf{D}(f, \theta_s)} \tag{9}$$

3. The suggested spectral mask

    The idea of the spectral mask $SLM(f, k)$ is based on an estimate of the a priori SNR.
$SLM(f, k)$ is given by:

      $$SLM(f, k) = \frac{1}{1 + SNR(f, k)} \tag{10}$$

   In [26], an estimate of the signal-to-noise ratio is derived as:

      $$SNR(f, k) = \frac{\Gamma_n - \Gamma_x}{\Gamma_x - \Gamma_s} \tag{11}$$

   where $\Gamma_x$, $\Gamma_s$, and $\Gamma_n$ are, respectively, the coherence function between the
two microphone array signals, the complex coherence function of the desired signal, and the coherence
of the surrounding noise field.
   An appropriate model must be chosen to represent these coherence functions, which depend on many
factors. Based on the work in [27], the author uses the formulation:

      $$SNR(f, k) = \frac{\Gamma_n - \mathrm{Re}\{\Gamma_s^{*}\,\Gamma_x\}}{\mathrm{Re}\{\Gamma_s^{*}\,\Gamma_x\} - 1} \tag{11}$$
   Therefore, the microphone array signals $X_1(f, k)$ and $X_2(f, k)$ are pre-processed as follows
to suppress the speech component:

      $$\hat{X}_1(f, k) = X_1(f, k) \times SLM(f, k) \tag{12}$$
      $$\hat{X}_2(f, k) = X_2(f, k) \times SLM(f, k) \tag{13}$$
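
   The mask and its application in (10)-(13) can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: the coherences are passed in as already-estimated inputs, the noise coherence is assumed real-valued, negative SNR estimates are clamped to zero, and a small `eps` guards the division. None of these choices is specified in the text, and the example values are hypothetical.

```python
import numpy as np


def coherence_snr(gamma_x, gamma_s, gamma_n, eps=1e-12):
    """Second form of (11): (Gamma_n - Re{Gs* Gx}) / (Re{Gs* Gx} - 1)."""
    r = np.real(np.conj(gamma_s) * gamma_x)
    denom = np.minimum(r - 1.0, -eps)              # r <= 1 for valid coherences; avoid 0
    return np.maximum((gamma_n - r) / denom, 0.0)  # clamp negative estimates (assumption)


def slm_mask(snr):
    """Equation (10): SLM = 1 / (1 + SNR)."""
    return 1.0 / (1.0 + snr)


def apply_mask(X1, X2, mask):
    """Equations (12)-(13): scale both channels by the same real-valued mask."""
    return X1 * mask, X2 * mask


# Hypothetical bins: noise only, equal speech and noise, speech only.
gamma_x = np.array([0.3, 0.65, 1.0])
snr = coherence_snr(gamma_x, gamma_s=1.0, gamma_n=0.3)
mask = slm_mask(snr)
X1_hat, X2_hat = apply_mask(np.full(3, 2.0 + 0j), np.full(3, 4.0 + 0j), mask)
```

   In this example the mask is close to 1 where the observed coherence matches the noise coherence and close to 0 where it matches the speech coherence, which is how the pre-processing suppresses the speech component.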

    The spectral mask makes the MVDR beamformer more robust. In the next section, the author
demonstrates an experiment in a coherent noise field.

4. Experiments




 Figure 5: The illustrated scheme of experiment
    In this section, the author performs an illustrative experiment with a target speaker standing
at a distance $L = 2$ (m) from a DMA2, in the direction $\theta_s = 90^\circ$. The distance between
the two microphones is $d = 5$ (cm). The recording takes place in a living room, where a coherent
noise field is present.
    The purpose is to verify the effectiveness of the proposed spectral mask (SLM) in comparison
with MVDR-conventional in terms of increasing speech quality and reducing speech distortion.
An objective measurement [18] is used to calculate the speech quality. The noisy signal is captured
with the DMA2 at $F_s = 16$ kHz. For further signal processing, the following parameters are used:
$NFFT = 512$, overlap 50%, smoothing parameter $\alpha = 0.5$. Figure 6 shows the waveform of the
microphone array signal.
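
    As a quick worked check of what this configuration implies, the quantities below follow from the stated parameters and the definitions in Section 2. The frame length is assumed to equal NFFT, which the text does not state explicitly.

```latex
\begin{align*}
\text{frame length} &= \frac{NFFT}{F_s} = \frac{512}{16000\ \text{Hz}} = 32\ \text{ms}, \\
\text{hop size (50\% overlap)} &= \frac{256}{16000\ \text{Hz}} = 16\ \text{ms}, \\
\Delta f &= \frac{F_s}{NFFT} = \frac{16000\ \text{Hz}}{512} = 31.25\ \text{Hz}, \\
\tau_0 &= \frac{d}{c} = \frac{0.05\ \text{m}}{343\ \text{m/s}} \approx 0.146\ \text{ms}, \\
\Phi_s &= \pi f \tau_0 \cos(90^\circ) = 0
  \;\Rightarrow\; \mathbf{D}(f, \theta_s) = [1 \;\; 1]^T .
\end{align*}
```

    With the speaker at 90 degrees, the steering vector is therefore the same at every frequency, and the distortionless constraint reduces to requiring that the conjugated weights sum to one.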




 Figure 6: The waveform of microphone array signal

     Applying the conventional MVDR beamformer gives the output signal shown in Figure 7.




 Figure 7: The waveform of processed signal by MVDR - conventional

       The spectral mask removes the speech component at the MVDR beamformer's input and enhances
the overall performance. The resulting signal is shown in Figure 8.

 Figure 8: The waveform of processed signal by using spectral mask - SLM

   Comparing the energy of the microphone array signal with that of the signals processed by
MVDR-conventional and SLM, we can see that SLM reduces speech distortion to 3.2 (dB) and increases
the speech quality, in terms of the signal-to-noise ratio (SNR), by 5.2 (dB) to 6.2 (dB).




 Figure 9: The energy of microphone array signal, MVDR – conventional and SLM

Table 1.
The signal-to-noise ratio (SNR), in dB

     Estimation method     Microphone array signal     MVDR-conventional     SLM
     NIST STNR             3.5                         18.3                  23.6
     WADA SNR              1.7                         19.5                  25.7

   In this experiment, the advantage of the suggested spectral mask has been demonstrated. The
result is promising for improving speech enhancement with the MVDR beamformer, which is the most
widely installed MA configuration in acoustic devices. Degradation or corruption of the output
speech is still a problem for digital signal processing algorithms. The author exploits a priori
information about the direction of arrival of the signal of interest and the properties of the
surrounding environment to form an appropriate spectral mask that suppresses the speech component
and improves the MVDR beamformer's performance. The proposed method is easy to implement, has a low
computational cost, and can be applied to multi-microphone systems.

5. Conclusion
   Target speech separation methods extract the desired speaker from a noisy mixture of speech and
background noise when interfering sources and third-party talkers exist. These algorithms serve as
essential front ends for many speech communication systems, such as speech recognition, digital
hearing aids, surveillance, smart homes, speaker verification, and teleconferencing.
Consequently, digital signal processing by MA beamforming is an important part of most speech
applications. In this contribution, the author demonstrated an additional spectral mask that
suppresses the speech component in the MA signals to enhance the MVDR beamformer's performance.
The numerical results confirm that the suggested technique increases the speech quality and
perceptual quality metric of the output signal by 5.2 (dB) to 6.2 (dB) and reduces speech
distortion to 3.2 (dB). In future work, the author will incorporate the properties of the recording
environment to further improve the MVDR beamformer's enhancement.

6. Acknowledgements
   This research was supported by Digital Agriculture Cooperative. The author thanks colleagues
from Digital Agriculture Cooperative, who provided insight and expertise that greatly assisted the
research.

7. References
[1] Dietzen T., Doclo S., Moonen M., Waterschoot T. Integrated Sidelobe Cancellation and Linear
   Prediction Kalman Filter for Joint Multi-Microphone Speech Dereverberation Interfering Speech
   Cancellation and Noise Reduction. IEEE/ACM Trans. Audio Speech Lang. Process, vol. 28, pp.
   740-754, 2020. DOI: 10.1109/TASLP.2020.2966869.
[2] Wang W., Chen S., Wang R. A Fast Irregular Microphone Array Design Method Based on
   Acoustic Beamforming. IEEE Sensors Journal. DOI: 10.1109/JSEN.2023.3240888.
[3] Albertini D., Bernardini A., Borra F., Antonacci F., Sarti A. Two-Stage Beamforming With
   Arbitrary Planar Arrays of Differential Microphone Array Units. IEEE/ACM Transactions on
   Audio, Speech, and Language Processing, pp. 590-602. DOI: 10.1109/TASLP.2022.3231719.
[4] Yang W., Huang G., Zhang W., Chen J., Benesty J. Dereverberation with differential microphone
   arrays and the weighted-prediction-error method. 2018 16th International Workshop on Acoustic
   Signal Enhancement (IWAENC), pp. 376-380, 2018. DOI: 10.1109/IWAENC.2018.8521286.
[5] Xiao Y., Zhu S., Song W., Wan M., Gu J., Li T. Acoustic Beamforming via Interference-Plus-Noise
   Covariance Matrix Construction for Interferences and Noise Attenuation. 2022 IEEE International
   Conference on Robotics and Biomimetics (ROBIO). DOI: 10.1109/ROBIO55434.2022.10012011.
[6] Kagimoto Y., Itoyama K., Nishida K., Nakadai K. Spotforming by NMF Using
   Multiple Microphone Arrays. 2022 IEEE/RSJ International Conference on Intelligent Robots and
   Systems (IROS). DOI: 10.1109/IROS47612.2022.9981808.
[7] Kodrasi I., Doclo S. Joint Late Reverberation and Noise Power Spectral Density Estimation in a
   Spatially Homogeneous Noise Field. 2018 IEEE International Conference on Acoustics Speech
   and Signal Processing (ICASSP), pp. 441-445, 2018. DOI: 10.1109/ICASSP.2018.8462142.
[8] Braun S. Evaluation and Comparison of Late Reverberation Power Spectral Density
   Estimators. IEEE/ACM Trans. Audio Speech Lang. Process, vol. 26, no. 6, pp. 1056-1071, June
   2018. DOI: 10.1109/TASLP.2018.2804172.
[9] Cheng R., Bao C., Cui Z. Mass: Microphone array speech simulator in room acoustic environment
   for multi-channel speech coding and enhancement. Applied Sciences, vol. 10, no. 4, pp. 1484, 2020.
   https://doi.org/10.3390/app10041484.
[10] Zhang Z., Xu Y., Yu M. Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation.
   IEEE/ACM Trans. Audio Speech and Language Processing, vol. 29, pp. 3526-3540, Nov.2021.
   https://doi.org/10.48550/arXiv.2012.13442.
[11] Tammen M., Doclo S. Deep Multi-Frame MVDR Filtering for Single-Microphone Speech
   Enhancement. Proc. IEEE International Conference on Acoustics Speech and Signal Processing
   (ICASSP), pp. 8443-8447, Jun. 2021.
[12] Fengqi T., Changchun B., Liu T. An Effective Dereverberation Algorithm by Fusing MVDR and
   MCLP. 2022 IEEE International Conference on Signal Processing, Communications and
   Computing (ICSPCC). DOI: 10.1109/ICSPCC55723.2022.9984583.
[13] Schreibman A., Hadad E., Barnov A., Tzirkel-Hancock E. Dual MVDR Architecture for Adaptive
   Cancellation of Dynamic Interference. 2022 30th European Signal Processing Conference
   (EUSIPCO). DOI: 10.23919/EUSIPCO55093.2022.9909959.
[14] Hadad E., Doclo S., Nordholm S., Gannot S. Pareto Optimal Binaural MVDR Beamformer with
   Controllable Interference Suppression. 2022 International Workshop on Acoustic Signal
   Enhancement (IWAENC). DOI: 10.1109/IWAENC53105.2022.9914759.
[15] Tammen M., Doclo S. Deep Multi-Frame MVDR Filtering for Binaural Noise Reduction. 2022
   International Workshop on Acoustic Signal Enhancement (IWAENC).
   DOI: 10.1109/IWAENC53105.2022.9914742.
[16] Piyushkumar K., Shreya S., Ankur T., Hemant A. Robustness of
   DAS Beamformer Over MVDR for Replay Attack Detection On Voice Assistants. 2022 IEEE
   International Conference on Signal Processing and Communications (SPCOM).
   DOI: 10.1109/SPCOM55316.2022.9840757.
[17] Alastair H., Hafezi S., Rebecca R., Patrick A., Brookes M. A Compact Noise Covariance Matrix
   Model for MVDR Beamforming. IEEE/ACM Transactions on Audio, Speech, and Language
   Processing, pp. 2049-2061. DOI: 10.1109/TASLP.2022.3180671.
[18] https://labrosa.ee.columbia.edu/projects/snreval/
[19] Won K., Yeoum S., Kang B., Kim M., Shin Y., Choo H. Inaudible
   Transmission System with Selective Dual Frequencies Robust to Noisy Surroundings. 2020 IEEE
   International Conference on Consumer Electronics (ICCE).
   DOI: 10.1109/ICCE46568.2020.9042989.
[20] Zhou J., He S., Mo H., Tian X., Li Z. A Modified Dual Microphone Adaptive Filter for
   Auscultation. 2019 IEEE 14th International Conference on Intelligent Systems and Knowledge
   Engineering (ISKE). DOI: 10.1109/ISKE47853.2019.9170433.
[21] Kotus J., Szwoch G. Localization of sound sources with dual acoustic vector sensor. 2019 Signal
   Processing: Algorithms, Architectures, Arrangements, and Applications (SPA).
   DOI: 10.23919/SPA.2019.8936724.
[22] Kim S.M. Hearing Aid Speech Enhancement Using Phase Difference-Controlled Dual-
   Microphone Generalized Sidelobe Canceller. IEEE Access. DOI: 10.1109/ACCESS.2019.2940047.
[23] Tan K., Zhang X., Wang D.L. Real-time Speech Enhancement Using an Efficient Convolutional
   Recurrent Network for Dual-microphone Mobile Phones in Close-talk Scenarios. ICASSP 2019 -
   2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
   DOI: 10.1109/ICASSP.2019.8683385.
[24] Huang Y.A., Shabestary T.Z., Gruenstein A. Hotword Cleaner: Dual-microphone Adaptive Noise
   Cancellation with Deferred Filter Coefficients for Robust Keyword Spotting. ICASSP 2019 - 2019
   IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
   DOI: 10.1109/ICASSP.2019.8682682.
[25] Bagekar S., Tank V. Dual Channel Coherence Based Speech Enhancement with Wavelet
   Denoising. 2018 Second International Conference on Intelligent Computing and
   Control Systems (ICICCS). DOI: 10.1109/ICCONS.2018.8662885.
[26] Schwarz A., Kellermann W. Coherent-to-Diffuse Power Ratio Estimation for Dereverberation.
   IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 6,
   pp. 1006-1018, June 2015. DOI: 10.1109/TASLP.2015.2418571.
[27] Jeub M., Schafer M., Esch T., Vary P. Model-based dereverberation preserving binaural cues.
   IEEE Trans. Audio, Speech, and Language Process., vol. 18, no. 7, pp. 1732–1745, 2010.
   DOI: 10.1109/TASL.2010.2052156.
[28] Pu Y., Butterfield D., Garcia J., Xie J., Lin M., Sauhta R., Farley R., Shellhammer S.,
   Derkalousdian M., Newham A., Shi C., Shenoy R., Gousev E., Attar R. An Ultra-low-power 28nm
   CMOS Dual-die ASIC Platform for Smart Hearables. 2018 IEEE Biomedical Circuits
   and Systems Conference (BioCAS). DOI: 10.1109/BIOCAS.2018.8584806.