<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Techniques for Recognising and Classifying Environmental Noise Using Deep Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ludovica Beritelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Grazia Borzì</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristian Randieri</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberta Avanzato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Beritelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Electrical, Electronic and Computer Engineering University of Catania</institution>
          ,
          <addr-line>Catania</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi e-Campus</institution>
          ,
          <addr-line>Novedrate (CO)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>62</fpage>
      <lpage>67</lpage>
      <abstract>
<p>Increasing urbanisation poses new challenges in mitigating noise pollution and preserving quality of life. In this study, we present an innovative approach for the classification of environmental noise, exploiting advanced Deep Learning (DL) techniques. By merging three different public datasets, we created a unified corpus to train and test a convolutional neural network (CNN), with the aim of efficiently recognising and classifying various noise events. The proposed approach overcomes the limitations of conventional methodologies, avoiding the need for data pre-processing that could alter sound characteristics. The experimental results demonstrate a significant improvement in classification accuracy, reaching 96.93% with the test set and 100% by applying a post-processing filter. These results emphasise the potential of DL in the treatment of environmental noise, offering new perspectives for signal processing and telecommunications.</p>
      </abstract>
      <kwd-group>
        <kwd>Environmental Noise Classification</kwd>
        <kwd>Convolutional Neural Networks</kwd>
        <kwd>Signal Processing</kwd>
        <kwd>Noise Pollution</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The search for sustainable solutions to mitigate the impact
of environmental noise has become crucial to preserving
the quality of life in our increasingly urban society. In this
context, the recognition and classification of environmental
noise emerge as key challenges in the field of signal
processing and telecommunications, where noise can significantly
degrade the quality and intelligibility of transmitted signals
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Recently, the advancement of machine learning (ML)
and deep learning (DL) techniques has opened new frontiers
in the accuracy of noise classification.
      </p>
      <p>
        Pioneering studies, such as that of Couvreur et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
have demonstrated the effective use of hidden Markov
models (HMMs) for the recognition of sound events, offering
detailed analysis of sound signals in time and frequency.
Despite their effectiveness, these techniques require
considerable computational resources, posing challenges in practical
implementation [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7 ref8 ref9">3, 4, 5, 6, 7, 8, 9</xref>
        ]. In parallel, the approach
by Alsouda et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] presents a machine-learning-based
method for urban noise identification that uses an inexpensive
IoT unit, Mel-frequency cepstral coefficient (MFCC) extraction of
audio features, and supervised classification algorithms (such
as support vector machine, k-nearest neighbours, bootstrap
aggregation and random forest). This approach achieved
noise classification accuracy in the range of 88% to 94%.
The integration of HMM, fuzzy logic and neural networks
proposed by Beritelli et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] further emphasised the
importance of combining different methodologies to improve
classification accuracy on large noise databases.
Furthermore, a study conducted by Aksoy et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] used advanced
deep learning models, including VGG-13BN, ResNet-50 and
DenseNet-121, to classify sounds according to their
environmental relevance. The results demonstrated high
accuracy in classifying sounds, with correctness rates of over
95%, highlighting in particular the VGG-13BN model, which
achieved 99.72% accuracy. These results underline the
significant potential of the deep learning approach in identifying
sounds harmful to the environment. Another contribution
is made by Jeon et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], proposing a multi-channel indoor
noise database for the development and evaluation of speech
processing algorithms. This database includes noise signals
generated by physical actions and loudspeakers placed in
various locations within an apartment building, allowing for
a wide range of noise conditions. A further study, conducted
by Ramli et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], proposes a mechanism to reduce
background noise in voice communications through the use of
a two-sensor adaptive noise canceller. This system
demonstrated high convergence rates, significant improvements in
the signal-to-noise ratio, and a 65% reduction in
computational power compared to traditional methods. The study by
Tsai et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] analyses the spatial characteristics of urban
noise using noise maps and emphasises the importance of
noise maps for a better understanding and management of
urban noise.
      </p>
      <p>This study demonstrates how the application of DL
techniques can offer effective solutions to the challenges of
environmental noise classification, with potentially significant
benefits for the telecommunication sector and society at
large. Our research opens new perspectives for the use of
artificial intelligence in urban noise mitigation, promoting
a more sustainable environment and a better quality of life.</p>
      <p>In section 2 we discuss the importance of developing
effective noise classification strategies, which are essential for
improving the quality of communication and, consequently,
the quality of life in urban areas.</p>
      <p>In section 3, we present our innovative approach, which
exploits advanced DL techniques for analysing and
classifying environmental noise. We will illustrate how, through
the use of Convolutional Neural Networks (CNNs), our
model works directly with the raw audio data, avoiding
the loss of significant information that could result from
pre-processing steps.</p>
      <p>
        In section 4, we present the results obtained from our
study, demonstrating the effectiveness of the proposed
model in classifying environmental noise. The results show
a significant improvement in classification accuracy,
achieving remarkable performance in the tests performed. We will
also discuss the impact of a post-processing filter [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] in
further increasing the accuracy of the model.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Environmental Noise</title>
      <p>
        Environmental noise, defined as any unwanted sound
generated by the surrounding environment, is a major source
of noise pollution. These sounds may come from natural
sources such as sea waves or from man-made sources such
as vehicle traffic, alarms, voices and electronic devices.
Effective management of such noise requires methods that
go beyond simply measuring sound pressure levels (dB),
including characterising and identifying the type of noise [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
In the field of telecommunications, environmental noise
introduces significant challenges, degrading signal quality
and compromising communication efficiency. Research has
highlighted the importance of developing advanced noise
reduction strategies, through the use of machine learning
(ML) and deep learning (DL) techniques, aimed at
improving the accuracy of noise classification [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The studies in
[
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ] have contributed greatly to the understanding of
environmental noise by providing innovative approaches
for its analysis and classification. These works emphasise
the need for authentic and versatile databases to test and
develop signal processing algorithms capable of handling
the complexity of real acoustic environments [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The
accurate identification and classification of environmental noise
not only improves the performance of telecommunication
systems but also contributes to the health and well-being
of individuals by reducing exposure to harmful levels of
noise. Therefore, research in this area is crucial to advance
the design of more resilient communication systems and to
promote a more sustainable sound environment.
      </p>
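As a baseline for the level-based measurements mentioned above, the sound level of a digital recording block can be expressed in dB; the sketch below is our own minimal illustration (the function name and reference value are assumptions), and it shows why a pure level measure says nothing about *what* the noise is:

```python
import math

def rms_db(samples, ref=1.0):
    """Root-mean-square level of a sample block, in dB relative to `ref`.

    A level measure like this quantifies how loud a block is, but carries
    no information about the type of noise -- hence the need for
    classification beyond simple dB measurement.
    """
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Guard against log10(0) for silent blocks.
    return 20.0 * math.log10(max(rms, 1e-12) / ref)

# A full-scale square wave has RMS 1.0 -> 0 dB relative to ref=1.0.
print(rms_db([1.0, -1.0] * 100))  # 0.0
```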
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Method</title>
      <p>Advances in Machine Learning (ML) and Deep Learning
(DL) techniques have radically transformed the approach
to data analysis, allowing us to discover unexpected
complex patterns in audio data. In this study, we adopted an
innovative methodology that exploits neural networks to
directly process audio signals in .wav format. The aim is to
evaluate the ability of these networks to accurately classify
different sound events without resorting to pre-processing
techniques that could compromise data integrity.
Subsection 3.1 describes the datasets used and how they were
split for training, validation and testing of the neural
network; subsection 3.2 describes the CNN used for the
classification of ambient noise.</p>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>
          The dataset used in this research was composed by
merging three distinct public databases: UrbanSound [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ],
Demand [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and Noisex-92 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. This fusion created a
heterogeneous dataset that includes a wide range of sound
classes, specifically excluding the dog bark class from
UrbanSound, but incorporating common ambient noise classes
from Noisex-92 and Demand. A preprocessing phase is
carried out before feeding the data to the CNN.
Specifically, the recordings were all divided into 2-second
sub-sequences sampled at 22050 Hz. The dataset was
randomly divided into two different sets, one used for
network training and validation and the other for network
testing, ensuring an equal distribution of sound classes
between the two. The learning dataset includes classes such
as “air_conditioner", “children_playing", and “traffic", with a
variable number of sound sequences per class. Similarly, the
test dataset maintains a representative proportion of each
class, ensuring a valid evaluation of network performance.
        </p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Learning dataset</title>
          <p>
• “air_conditioner": 1271 audio sequences,
• “children_playing": 704 audio sequences,
• “babble": 259 audio sequences,
• “car_horn": 307 audio sequences,
• “drilling": 622 audio sequences,
• “engine_idling": 704 audio sequences,
• “jackhammer": 917 audio sequences,
• “metro": 1800 audio sequences,
• “office": 1800 audio sequences,
• “river": 1800 audio sequences,
• “siren": 956 audio sequences,
• “square": 1800 audio sequences,
• “street_music": 850 audio sequences,
• “traffic": 1800 audio sequences.
          </p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Testing dataset</title>
          <p>
• “air_conditioner": 543 audio sequences,
• “children_playing": 299 audio sequences,
• “babble": 15 audio sequences,
• “car_horn": 129 audio sequences,
• “drilling": 268 audio sequences,
• “engine_idling": 303 audio sequences,
• “jackhammer": 398 audio sequences,
• “metro": 600 audio sequences,
• “office": 600 audio sequences,
• “river": 600 audio sequences,
• “siren": 412 audio sequences,
• “square": 600 audio sequences,
• “street_music": 362 audio sequences,
• “traffic": 600 audio sequences.
          </p>
        </sec>
      </sec>
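The segmentation step described above (fixed 2-second sub-sequences at 22050 Hz) can be sketched as follows. The function name and the use of plain Python sequences are our own; a real pipeline would read the .wav files with an audio library such as torchaudio or librosa:

```python
SAMPLE_RATE = 22050     # Hz, as used in this study
CHUNK_SECONDS = 2.0     # fixed sub-sequence duration

def split_into_chunks(samples, sr=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    """Split a 1-D sequence of audio samples into fixed-length
    sub-sequences; a trailing remainder shorter than `seconds`
    is discarded."""
    n = int(sr * seconds)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

# A 7-second recording yields three 2-second chunks (1 s is discarded).
chunks = split_into_chunks([0.0] * (7 * SAMPLE_RATE))
print(len(chunks), len(chunks[0]))  # 3 44100
```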
      <sec id="sec-3-2">
        <title>3.2. Application of CNNs</title>
        <p>Artificial intelligence (AI) represents a vast and evolving
field of study that aims to emulate human cognitive
capabilities through the development of autonomous hardware
and software systems. This ambition to reflect human
intelligence in machines has led to the development of
technologies capable of autonomous learning, adaptation, reasoning
and planning. At the heart of AI are advanced algorithms
and computational techniques, which make it possible to
replicate typically human behaviours, such as interaction
with the environment and decision-making. The
applications of artificial intelligence span different fields, from
industrial to domestic, demonstrating its potential to
improve both the activities of businesses and public
administrations and the everyday lives of people.</p>
        <p>Convolutional Neural Networks (CNNs) stand out for
their effectiveness in analysing visual and sound data due
to their ability to identify complex patterns through the use
of convolutional filters.</p>
        <p>Our CNN architecture follows a structured model starting
with the input layer, proceeding through convolutional and
activation (ReLU) layers, pooling, and culminating in a fully
connected layer for final classification. This design allows
the network to process audio features from the simplest to
the most complex, facilitating deep and robust data learning.
The detailed configuration of convolutional, pooling, and
fully connected layers provides a powerful means to extract
and interpret sound features, making CNNs particularly
suitable for the recognition and classification of complex
sound events. Our research aims to demonstrate the
effectiveness of this approach in the field of acoustic analysis,
contributing significantly to the field of signal processing
and audio classification.</p>
        <p>
          The neural network used in this study is based on the
architecture of 1D convolutional neural networks and, in
particular, on the “M5 (0.5M)" model described in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. This
network consists of five layers, the first four of which are
convolutions (1D Convolution Layer, Batch Normalisation
Layer, ReLU Layer and Pooling Layer) and the last layer is
the output (Softmax).
        </p>
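The five-layer stack described above can be sketched in PyTorch. The channel widths and kernel sizes below follow the published "M5 (0.5M)" configuration of [16]; the exact hyperparameters used in this study are not stated here, so treat this as an illustrative reconstruction rather than the authors' exact network:

```python
import torch
import torch.nn as nn

class M5(nn.Module):
    """1-D CNN over raw waveforms: four Conv1d/BatchNorm/ReLU/MaxPool
    blocks, global average pooling, and a LogSoftmax output layer."""
    def __init__(self, n_classes=14):
        super().__init__()
        def block(c_in, c_out, k, stride=1):
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, k, stride=stride),
                nn.BatchNorm1d(c_out), nn.ReLU(), nn.MaxPool1d(4))
        self.features = nn.Sequential(
            block(1, 128, 80, stride=4),   # wide first filter on raw audio
            block(128, 128, 3),
            block(128, 256, 3),
            block(256, 512, 3))
        self.classifier = nn.Linear(512, n_classes)

    def forward(self, x):                   # x: (batch, 1, samples)
        h = self.features(x).mean(dim=2)    # global average pooling
        return torch.log_softmax(self.classifier(h), dim=1)

# One 2-second clip at 22050 Hz -> log-probabilities over 14 classes.
out = M5()(torch.zeros(1, 1, 44100))
print(out.shape)  # torch.Size([1, 14])
```

At inference time the predicted class index is simply `out.argmax(dim=1)`, matching the index-selection step described in the text.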
        <p>The neural network’s input is a vector containing
sequences of audio waveforms, each with a duration of T =
2 seconds. The CNN determines the
index associated with one of C = 14 different classes
(c = 1, ..., C) using the LogSoftMax function. The
network is trained by feeding it raw sequences representing
different environmental noises.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results</title>
      <p>The validation process of our approach was carried out
through a rigorous experiment involving the direct input
of raw audio data, in .wav format, into the convolutional
neural network (CNN). Below, we present a detailed analysis
of the performance obtained during the different phases of
training, validation and testing of the network.</p>
      <sec id="sec-4-1">
        <title>4.1. Training and Validation</title>
        <p>During the training phase, we observed a progressive
improvement in network performance, as illustrated in Fig. 1.
This graph shows an increase in accuracy and a decrease in
the loss function as the epochs progress, highlighting the
effectiveness of the learning process. The dataset was split
into a proportion of 70% for training and 30% for validation,
as illustrated in Section 3.1.</p>
        <p>Fig. 2 presents the confusion matrix obtained from the
validation of the model, providing a clear indication of its
classification capability across the different sound categories.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Testing</title>
        <p>The effectiveness of the model was further verified through
testing on a separate dataset, achieving an impressive
accuracy of 96.93%. Fig. 3 illustrates the confusion matrix
for this phase, confirming the network’s high accuracy in
recognising environmental sounds.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Post-Processing and Time Window Analysis</title>
        <p>
          The introduction of a post-processing filter, called the
“recurrence filter" [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], further improved the performance of
the model. As demonstrated in Fig. 4, the accuracy of the
system increases significantly by extending the analysis
time window. In particular, it can be seen that by extending
the analysis beyond 28 seconds, the accuracy reaches 100%.
        </p>
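The recurrence filter of [16] operates on the sequence of per-chunk decisions; a simple majority-vote reading of it (our own minimal interpretation, not necessarily the exact filter of [16]) makes the time-window trade-off concrete: a longer window smooths away isolated misclassifications at the cost of decision latency.

```python
from collections import Counter

def recurrence_filter(labels, window=5):
    """Smooth a sequence of per-chunk class decisions by replacing each
    decision with the most frequent label among the last `window`
    decisions. Extending the window increases accuracy on stationary
    noise at the cost of a longer analysis time."""
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1):i + 1]
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed

# An isolated misclassification ("siren") is voted away by its neighbours.
print(recurrence_filter(["traffic"] * 3 + ["siren"] + ["traffic"] * 3))
```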
        <p>The results underline the effectiveness of our approach
based on the use of convolutional neural networks for
analysing environmental sound, highlighting the potential
of deep learning techniques in overcoming the challenges
of accurately recognising complex sound events.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This study introduced a new approach for the classification
of environmental noise, exploiting the potential of Deep
Learning techniques to address one of the most pressing
challenges in signal processing and telecommunications.
Through the use of a CNN trained on a unified dataset
derived from three different public sources, it is shown that
high accuracy in the classification of environmental noise
events can be achieved without the need for complex
preprocessing. The results obtained reveal a marked
improvement in classification accuracy, highlighting the
effectiveness of our model both in the testing phase and in the
application of post-processing techniques. These results not
only confirm the value of convolutional neural networks in
acoustic analysis, but also open the way for future research
to explore the applicability of such methods in broader
areas, including urban noise monitoring and the improvement
of telecommunication systems. In conclusion, our study
contributes significantly to the body of research on signal
processing, proposing an effective and efficient model for
the classification of ambient noise, with direct implications
for environmental sustainability and quality of life in urban
areas.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Beritelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gallotta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rametta</surname>
          </string-name>
          ,
          <article-title>A dual streaming approach for speech quality enhancement of voip service over 3g networks</article-title>
          ,
          <source>in: 2013 18th International Conference on Digital Signal Processing (DSP)</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Couvreur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Fontaine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gaunard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Mubikangiey</surname>
          </string-name>
          ,
          <article-title>Automatic classification of environmental noise events by hidden markov models</article-title>
          ,
          <source>Applied Acoustics</source>
          <volume>54</volume>
          (
          <year>1998</year>
          )
          <fpage>187</fpage>
          -
          <lpage>206</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Paternò</surname>
          </string-name>
          ,
          <article-title>An innovative hybrid neuro-wavelet method for reconstruction of missing data in astronomical photometric surveys</article-title>
          ,
          <source>in: Artificial Intelligence and Soft Computing: 11th International Conference, ICAISC</source>
          <year>2012</year>
          , Zakopane, Poland, April 29-May 3,
          <year>2012</year>
          , Proceedings,
          <source>Part I 11</source>
          , Springer,
          <year>2012</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Brandizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brociek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <article-title>First studies to apply the theory of mind theory to green and smart mobility by using gaussian area clustering</article-title>
          , volume
          <volume>3118</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bonanno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. Lo</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <article-title>A neuro waveletbased approach for short-term load forecasting in integrated generation systems</article-title>
          , in: 2013
          <source>International Conference on Clean Electrical Power (ICCEP)</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>772</fpage>
          -
          <lpage>776</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brociek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Analysis pre and post covid-19 pandemic rorschach test data of using em algorithms and gmm models</article-title>
          , volume
          <volume>3360</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>55</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Susi</surname>
          </string-name>
          ,
          <article-title>A spiking neural network-based long-term prediction system for biogas production</article-title>
          ,
          <source>Neural Networks</source>
          <volume>129</volume>
          (
          <year>2020</year>
          )
          <fpage>271</fpage>
          -
          <lpage>279</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>De Magistris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Romano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Starczewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A novel dwt-based encoder for human pose estimation</article-title>
          , volume
          <volume>3360</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bonanno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Capizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Sciuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Wavelet recurrent neural network with semi-parametric input data preprocessing for micro-wind power forecasting in integrated generation systems</article-title>
          ,
          <year>2015</year>
          , pp.
          <fpage>602</fpage>
          -
          <lpage>609</lpage>
          . doi:10.1109/ICCEP.2015.7177554.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Alsouda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pllana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kurti</surname>
          </string-name>
          ,
          <article-title>Iot-based urban noise identification using machine learning: performance of svm, knn, bagging, and random forest</article-title>
          ,
          <source>in: Proceedings of the international conference on omni-layer intelligent systems</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Beritelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Grasso</surname>
          </string-name>
          ,
          <article-title>A pattern recognition system for environmental sound classification based on mfccs and neural networks</article-title>
          ,
          <source>in: 2008 2nd International Conference on Signal Processing and Communication Systems</source>
          , IEEE,
          <year>2008</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Aksoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Uygar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karadağ</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Kaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ö.</given-names>
            <surname>Melek</surname>
          </string-name>
          ,
          <article-title>Classification of environmental sounds with deep learning</article-title>
          ,
          <source>Advances in Artificial Intelligence Research</source>
          <volume>2</volume>
          (
          <year>2022</year>
          )
          <fpage>20</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Jeon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. K.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Jo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. K.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Design of multi-channel indoor noise database for speech processing in noise</article-title>
          ,
          <source>in: 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Ramli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. O. A.</given-names>
            <surname>Noor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abdul Samad</surname>
          </string-name>
          ,
          <article-title>Noise cancellation using selectable adaptive algorithm for speech in variable noise environment</article-title>
          ,
          <source>International Journal of Speech Technology</source>
          <volume>20</volume>
          (
          <year>2017</year>
          )
          <fpage>535</fpage>
          -
          <lpage>542</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K.-T.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-D.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Noise mapping in urban environments: A Taiwan study</article-title>
          ,
          <source>Applied Acoustics</source>
          <volume>70</volume>
          (
          <year>2009</year>
          )
          <fpage>964</fpage>
          -
          <lpage>972</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Avanzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Beritelli</surname>
          </string-name>
          ,
          <article-title>Heart sound multiclass analysis based on raw data and convolutional neural network</article-title>
          ,
          <source>IEEE Sensors Letters</source>
          <volume>4</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thiemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Vincent</surname>
          </string-name>
          ,
          <article-title>The diverse environments multi-channel acoustic noise database (demand): A database of multichannel environmental noise recordings</article-title>
          ,
          <source>in: Proceedings of Meetings on Acoustics</source>
          , volume
          <volume>19</volume>
          ,
          AIP Publishing
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Salamon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jacoby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Bello</surname>
          </string-name>
          ,
          <article-title>A dataset and taxonomy for urban sound research</article-title>
          ,
          <source>in: Proceedings of the 22nd ACM international conference on Multimedia</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1041</fpage>
          -
          <lpage>1044</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Varga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Steeneken</surname>
          </string-name>
          ,
          <article-title>Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems</article-title>
          ,
          <source>Speech Communication</source>
          <volume>12</volume>
          (
          <year>1993</year>
          )
          <fpage>247</fpage>
          -
          <lpage>251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Salamon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Bello</surname>
          </string-name>
          ,
          <article-title>Unsupervised feature learning for urban sound classification</article-title>
          ,
          <source>in: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , IEEE,
          <year>2015</year>
          , pp.
          <fpage>171</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Piczak</surname>
          </string-name>
          ,
          <article-title>ESC: Dataset for environmental sound classification</article-title>
          ,
          <source>in: Proceedings of the 23rd ACM international conference on Multimedia</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1015</fpage>
          -
          <lpage>1018</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>