<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Indian Classical Raga Identification using Machine Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dipti Joshi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dr. Jyoti Pareek</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pushkar Ambatkar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Gujarat University</institution>
          ,
          <addr-line>Ahmedabad, Gujarat</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>259</fpage>
      <lpage>263</lpage>
      <abstract>
        <p>Ragas are the pride of Indian classical music. The raga is the fundamental melodic form in Indian classical music: a set of swaras (musical notes) whose various characteristics combine into a melodic conception rendered by instruments and singers. Based on the features of the ragas, Indian classical music is separated into two parts: Hindustani (North Indian) and Carnatic (South Indian) classical music. Our experiment concentrates on Hindustani classical music. In our experiment, K Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers are applied to a dataset of the ragas Yaman and Bhairavi to achieve classification and identification of the raga. We have obtained accurate results with both the KNN and SVM classifiers.</p>
      </abstract>
      <kwd-group>
        <kwd>Raga identification</kwd>
        <kwd>Feature extraction</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>KNN</kwd>
        <kwd>SVM</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Indian classical music is the music of the Indian
subcontinent. The Raga, or Raag, holds a prominent
position in Indian classical music. A Raag is a
collection of musical notes that, when sung or
performed on a musical instrument, is quite
attractive. Raga recognition comprises methods
that identify the notes in a piece of music and
classify them into a suitable raga. In Hindustani
classical music, the Raga is a very significant
concept that expresses the moods and sentiments of
a concert. Classifying ragas is an intellectual
process that comes only after a sufficient amount
of exposure. The attributes of ragas have to be
translated into appropriate characteristics for
automated recognition.
The characteristics of Ragas are based on Indian
classical music techniques, which blend notes
with the following features to qualify as a Raga.</p>
      <sec id="sec-1-1">
        <title>Notes (swaras)</title>
        <p>There have to be at least five notes (swaras) in a
Raag. The primary seven notes are S (Sa), R (Re
or Ri), G (Ga), M (Ma), P (Pa), D (Dha), and N (Ni).</p>
      </sec>
      <sec id="sec-1-2">
        <title>Aaroh and Avroh</title>
        <p>Each Raga or Raag is composed of an "Aaroh",
the ascending scale of swaras, and an "Avroh",
the descending scale.</p>
      </sec>
      <sec id="sec-1-3">
        <title>Vadi and Samvadi</title>
        <p>Each raag consists of a "Vadi", the main note,
and a "Samvadi", the supporting note.</p>
      </sec>
      <sec id="sec-1-4">
        <title>Gamakas</title>
        <p>An ordinary note has a constant frequency. Notes
in a raga that are instead rendered as a series of
continuous variations (a back-and-forth movement
in rhythm) are known as Gamakas.</p>
      </sec>
      <sec id="sec-1-5">
        <title>Pakad</title>
        <p>A Pakad is a set of swaras that distinctively
identifies a raga. There is a particular Pakad for each raga.</p>
      </sec>
      <sec id="sec-1-6">
        <title>Tala</title>
        <p>Tala refers to a rhythmic form, which is
constructed from a cycle of beats.</p>
      </sec>
      <sec id="sec-1-7">
        <title>Thaat</title>
        <p>The Thaat is used in raga classification. There are
ten unique Thaats, namely Kalyan, Bilawal,
Bhairav, Khamaj, Poorvi, Marwa, Kafi, Asawari,
Bhairavi, and Todi.</p>
        <p>
          The piece of music has to be converted into Swaras
for classification. There are several difficulties in
converting a piece of music into Swaras, due to
the following factors [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]:
1. During a performance, a piece of music is made
up of many instruments.
2. The notes in Indian classical music are on a
relative scale.
3. In a raga, there is no fixed initial Swara.
4. In Indian music, the notes do not have a
predetermined frequency.
5. In classical music, the sequence of swaras in
a particular raga is not fixed, as it allows
various improvisations.
        </p>
        <p>The key purpose of raga recognition is that it
provides a good starting point for Hindustani music
information retrieval and allows us to evaluate the
performance and accuracy of raga prediction. Besides
this, for music analysis, we can also create playlists
organized by raga.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        In this section, we review work done by other
authors and analyze it for future scope. We have
tried to give an analysis of various classifiers and
their relevance and performance for raga
identification. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
Hiteshwari Sharma and Rasmeet S. Bali performed
raga identification on four ragas (Des, Bhupali,
Yaman, and Todi), using a dataset of live vocal
and instrumental performances, and carried out
identification using pitch class profiles and n-gram
histogram machine learning classifiers. They
obtained 83.39% accuracy for the pitch class
profile and 97.3% for the n-gram histogram.
In this paper [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] Ekta Patel and
Savita Chauhan used the MATLAB toolbox to
extract track features, together with WEKA, a
machine learning tool that works on the .arff file
format, applying Bayesian Net, Naive Bayes,
Support Vector Machine (SVM), J48, Decision
Table, and Random Forest classifiers to a dataset
of the ragas Bhairav, Yaman, Shanakara, and
Saarang. The predominant challenges are
complicated variables such as pitch and mood in
the music track, skipped higher tones, and the
transformation of various dataset parameters and
the Raag. The results were compared before and
after discretization; in this Raag identification
task the probability-based classifiers were more
accurate, with Bayesian Net providing the best
performance. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] Hiteshwari Sharma and R. S.
Bali identified various key variables for raga
classification and used a soft-computing fuzzy-set
technique for raga recognition. They used a
dataset of five ragas (Des, Bhupali, Yaman,
Todi, and Pahadi) with three parameters, namely
time, dirgaswaras, and vadi, and achieved
reasonable accuracy as well. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] G. Pandey, C.
Mishra, and Paul Ipe introduced the Hidden
Markov Model and Pakad matching on a dataset
of two ragas, Bhupali and Yaman Kalyan. They
achieved 77% accuracy with a basic HMM
and 87% accuracy with HMM combined with
Pakad matching. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] Muhammad Asim Ali
and Zain Ahmed Siddiqui based their research on
automatic music genre classification using
machine learning, for which they used algorithms
such as K Nearest Neighbor (KNN) and Support
Vector Machine (SVM) to predict the genre of
songs. They gathered musical data from the
GTZAN dataset of 1000 songs, which covers a
wide range of ten genres: blues, hip-hop, jazz,
classical, metal, reggae, country, pop, disco, and
rock. Their comparison shows that SVM is a
more efficient classifier than KNN. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
Snigdha Chillara, Kavitha A, Shwetha A Neginhal,
Shreya Haldia, and Vidyullatha K proposed to solve
the classification problem and compare several
models using the Free Music Archive small
(fma_small) dataset. Two sorts of inputs were
given to the models: the CNN models used
spectrogram images, while the Logistic Regression
and ANN models used audio features stored in a
.csv file. They achieved 88.5% accuracy with the
spectrogram-based CNN model, which is quite
good compared to the algorithms used by other
authors.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Work Done</title>
      <p>In this section, we briefly discuss different
characteristics of audio and the Machine Learning
algorithms K Nearest Neighbor and Support
Vector Machine.</p>
    </sec>
    <sec id="sec-4">
      <title>Feature Extraction</title>
      <p>Each audio signal comprises several features, but
we need to extract only the characteristics that are
relevant to the problem we want to solve. The
process of extracting characteristics to use in a
study is referred to as feature extraction. Some of
these characteristics are described briefly below.</p>
      <sec id="sec-4-1">
        <title>Power Spectrogram</title>
        <p>A spectrogram is a graphical representation of the
spectrum of frequencies of a signal as it varies
with time. When applied to an audio signal,
spectrograms are sometimes referred to as
sonographs, voiceprints, or voicegrams. To
determine the raga, we use the mean of the
spectrogram to find which tone or pitch is used most.</p>
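        <p>The mean power spectrogram used here can be sketched in plain numpy (an illustrative reconstruction, not the paper's code; the sampling rate, FFT size, and hop length below are assumptions, and Librosa provides equivalent routines directly):</p>
        <preformat>
```python
import numpy as np

def power_spectrogram(signal, n_fft=2048, hop=512):
    """Squared-magnitude short-time Fourier transform of a 1-D signal."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2   # rows: frames, cols: bins

sr = 22050                                 # assumed sampling rate
t = np.arange(sr) / sr
spec = power_spectrogram(np.sin(2 * np.pi * 440 * t))  # one second of A4
mean_spec = spec.mean(axis=0)              # average over time, as in the paper
print(np.argmax(mean_spec) * sr / 2048)    # dominant frequency, near 440 Hz
```
        </preformat>
        <p>Averaging the spectrogram over time collapses it to one energy value per frequency bin, so the strongest bin indicates the most-used pitch.</p>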
      </sec>
      <sec id="sec-4-2">
        <title>MFCC- Mel-Frequency Cepstral Coefficients</title>
        <p>This is one of the most important techniques for
extracting attributes of an audio signal and is used
widely whenever we work with audio. The MFCCs
of a signal are a small set of features
(approximately 10-20) that concisely describe the
overall shape of the spectral envelope.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Spectral Centroid</title>
        <p>The spectral centroid indicates the "center of mass"
of the spectrum, computed as the weighted mean of
the frequencies present in the sound. If the
frequencies are spread evenly across a given time
span, the spectral centroid lies near the middle of
the spectrum; if high frequencies dominate toward
the end of the sound, the centroid shifts toward
that end.</p>
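        <p>The weighted-mean definition above is short enough to state directly in numpy (a minimal sketch with made-up toy spectra, not the experiment's code):</p>
        <preformat>
```python
import numpy as np

def spectral_centroid(magnitudes, freqs):
    """Weighted mean of the frequencies, weighted by spectral magnitude."""
    return np.sum(freqs * magnitudes) / np.sum(magnitudes)

freqs = np.linspace(0, 11025, 1025)      # frequency bins up to Nyquist
low = np.zeros(1025)
low[:100] = 1.0                          # energy near the bottom of the spectrum
high = np.zeros(1025)
high[900:] = 1.0                         # energy near the top of the spectrum
print(spectral_centroid(low, freqs), spectral_centroid(high, freqs))
```
        </preformat>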
      </sec>
      <sec id="sec-4-4">
        <title>Zero-Crossing Rate</title>
        <p>The rate at which the sign of the signal changes is
known as the zero-crossing rate: the rate at which
the signal varies from positive to negative and vice
versa. The zero-crossing rate is commonly used in
speech recognition and music information
retrieval. It takes high values for loud and noisy
sounds, as in metal and rock.</p>
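        <p>The definition can be sketched in a few lines of numpy (our own illustration with synthetic signals, not code from the experiment): a smooth tone crosses zero rarely, while noise crosses constantly.</p>
        <preformat>
```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    signs = np.sign(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

t = np.arange(1000) / 1000.0
tone = np.sin(2 * np.pi * 5 * t)                        # smooth low tone
noise = np.random.default_rng(0).standard_normal(1000)  # noisy signal
print(zero_crossing_rate(tone), zero_crossing_rate(noise))
```
        </preformat>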
      </sec>
      <sec id="sec-4-5">
        <title>Roll-Off Frequency</title>
        <p>Roll-off describes the behavior of a particular
kind of filter, one designed to attenuate
frequencies above or below a certain point. It is
called roll-off because the attenuation is gradual.</p>
      </sec>
      <sec id="sec-4-6">
        <title>Spectral Bandwidth</title>
        <p>The spectral bandwidth is the extent of the
spectrum over which a radiated spectral quantity
is not less than half its maximum value; it is the
interval between the lower and the higher
frequency at that level.</p>
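        <p>In practice, libraries such as Librosa compute spectral bandwidth as a magnitude-weighted spread of frequencies around the spectral centroid; a minimal numpy sketch of that formulation (our illustration, with toy spectra, not the paper's code):</p>
        <preformat>
```python
import numpy as np

def spectral_bandwidth(magnitudes, freqs):
    """Magnitude-weighted spread of frequencies around the spectral centroid."""
    centroid = np.sum(freqs * magnitudes) / np.sum(magnitudes)
    deviation = (freqs - centroid) ** 2
    return np.sqrt(np.sum(magnitudes * deviation) / np.sum(magnitudes))

freqs = np.linspace(0, 11025, 1025)
narrow = np.zeros(1025)
narrow[500:510] = 1.0                 # energy confined to a thin band
flat = np.ones(1025)                  # energy spread over the whole spectrum
print(spectral_bandwidth(narrow, freqs), spectral_bandwidth(flat, freqs))
```
        </preformat>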
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Machine Learning Algorithms</title>
      <p>During our analysis, we have found that the
Supervised Machine Learning approach might be a
good fit for our problem.</p>
      <p>We have tried various classification algorithms
and found that K Nearest Neighbor and Support
Vector Machine are quite appropriate for our
experiment.</p>
      <sec id="sec-5-1">
        <title>K Nearest Neighbor (KNN)</title>
        <p>K Nearest Neighbor (KNN) is a supervised
learning method. It is a simple but robust
algorithm that is applied to both regression and
classification problems. To make a prediction, the
KNN algorithm uses the whole dataset: we attempt
to assign a data point to a particular category with
the help of the training set.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Support Vector Machine (SVM)</title>
        <p>Support vector machine (SVM) is also a
supervised learning method that is usually used for
classification. The algorithm identifies a hyperplane
that clearly divides the sample points with different
labels, separating the points of each class on
opposite sides of the hyperplane.</p>
      </sec>
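      <p>The KNN prediction step described above can be sketched from scratch in numpy (a hypothetical minimal illustration with made-up feature vectors, not the implementation used in the experiment):</p>
      <preformat>
```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to all points
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority label

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array(["Bhairavi", "Bhairavi", "Yaman", "Yaman"])
print(knn_predict(X, y, np.array([0.95, 1.0])))   # prints "Yaman"
```
      </preformat>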
    </sec>
    <sec id="sec-6">
      <title>4. Dataset and Strategy</title>
      <p>In this experiment, we have used a dataset of
audio files, created by extracting 60-second audio
clips from the internet. For music and audio
analysis, the Python package Librosa is used; it
provides the building blocks required for creating
music information retrieval systems. Another
open-source machine learning library, Scikit-learn,
supports both supervised and unsupervised
learning methods and also provides a variety of
tools for model selection, data preprocessing,
model fitting, and evaluation. In this experiment,
we have chosen the Yaman and Bhairavi ragas.
We split the recordings into 60-second frames,
which allows the computer to work only on
specific parts of the song, such as the Pakad,
Aaroh, and Avroh, and to remove noise and empty
segments from the audio. Using the Librosa
library, we created a .CSV file to save all the
features to be extracted from each audio file:
MFCC, spectrogram, bandwidth, centroid,
zero-crossing rate, and roll-off. Below is the flow
of the process.</p>
      <p>[Process flow: Audio Dataset → Feature Extraction → Identification and Classification]</p>
      <p>This is the visual representation of the
classification procedure, in which features are
extracted from the audio frame and compared with
the weight of the closest mean. Using the
Scikit-learn library, we have implemented the KNN
and SVM algorithms on the .CSV data file. We
found that KNN and SVM fit the classification
of ragas best, ahead of Logistic Regression.</p>
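      <p>As a sketch of this pipeline (the feature matrix below is a synthetic stand-in for the extracted .CSV features, and the KNN/SVM settings are illustrative assumptions, not the exact configuration of the experiment), the Scikit-learn calls look like:</p>
      <preformat>
```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative feature matrix: rows are audio clips, columns are features
# such as MFCC means, spectral centroid, bandwidth, roll-off, and ZCR.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (40, 6)),    # stand-in for "Yaman" clips
               rng.normal(3.0, 1.0, (40, 6))])   # stand-in for "Bhairavi" clips
y = np.array(["Yaman"] * 40 + ["Bhairavi"] * 40)

# 80/20 Train/Test ratio, as in the experiment
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
print(knn.score(X_te, y_te), svm.score(X_te, y_te))
```
      </preformat>
      <p>With real data, the same two fitted estimators would simply be given the feature rows loaded from the .CSV file instead of the synthetic matrix.</p>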
    </sec>
    <sec id="sec-7">
      <title>5. Result and Discussion</title>
      <p>In our experiment, we have chosen the ragas
Yaman and Bhairavi. We have used a vocal and
instrumental dataset consisting of 341 audio clips,
of which 194 audio clips are of Yaman and
147 audio clips are of Bhairavi.</p>
      <p>The tables below show the accuracy of KNN and
SVM algorithms.</p>
      <sec id="sec-7-1">
        <title>Table 1. Results of Raga Identification accuracy using KNN</title>
        <p>[Table 1 reports the classification accuracy of KNN for neighbor values 1 to 4 at Train/Test ratios of 80/20, 60/40, and 40/60.]</p>
      </sec>
      <sec id="sec-7-3">
        <title>Discussion</title>
        <sec id="sec-7-3-3">
          <title>Classification Accuracy</title>
          <p>We split the data into three Train/Test ratio
categories: 80/20, 60/40, and 40/60. For all KNN
values, we obtained the highest accuracy at the
80/20 Train/Test ratio; for KNN values 1 to 4 we
received 98%, 97%, 97%, and 95% accuracy
respectively. Using SVM, we obtained 95%
accuracy for the 80/20 Train/Test ratio.</p>
          <p>From this comparison, we can conclude that KNN
with all neighbor values at the 80/20 Train/Test
ratio gives the highest accuracy, performing better
than SVM.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>6. Conclusion and Future work</title>
      <p>A short introduction to ragas and their attributes
has been given. Prior approaches to raga
classification and recognition have been surveyed
along with their datasets, implementations,
accuracy, and problems. In this paper, we have
discussed the classification of the ragas Yaman and
Bhairavi by applying the K Nearest Neighbor (KNN)
and Support Vector Machine (SVM) machine learning
algorithms. We obtained good results with both
KNN and SVM, but in our experiment KNN
performed slightly better.</p>
      <p>In the future, we will expand our dataset with
many other ragas to acquire more accurate
results, and also implement other classifiers to
detect ragas in a well-defined manner.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Vijay</given-names>
            <surname>Kumar</surname>
          </string-name>
          , Harit Pandya,
          <string-name>
            <given-names>C.V.</given-names>
            <surname>Jawahar</surname>
          </string-name>
          , “Identifying Ragas in Indian Music”,
          <source>22nd International Conference on Pattern Recognition</source>
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Sharma</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Bali</surname>
          </string-name>
          , “
          <article-title>Comparison of ML classifiers for Raga recognition</article-title>
          ”,
          <source>Int. J. Sci. Res. Publ.</source>
          , vol.
          <volume>5</volume>
          , no.
          <issue>10</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Patel</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Chauhan</surname>
          </string-name>
          , “
          <article-title>Raag detection in music using supervised machine learning approach,”</article-title>
          <source>Int. J. Adv. Technol. Eng. Explor.</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>29</issue>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Sharma</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Bali</surname>
          </string-name>
          , “
          <article-title>Raga identification of Hindustani music using soft computing techniques</article-title>
          ”,
          <source>2014 Recent Adv. Eng. Comput. Sci. (RAECS)</source>
          , pp.
          <fpage>6</fpage>
          -
          <lpage>8</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pandey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mishra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Ipe</surname>
          </string-name>
          , “
          <article-title>Tansen: A system for automatic raga identification</article-title>
          ,
          <source>” Indian Int. Conf. Artif. Intell.</source>
          , pp.
          <fpage>1350</fpage>
          -
          <lpage>1363</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Muhammad Asim</given-names>
            <surname>Ali</surname>
          </string-name>
          and
          <string-name>
            <given-names>Zain Ahmed</given-names>
            <surname>Siddiqui</surname>
          </string-name>
          , “
          <article-title>Automatic Music Genres Classification using Machine Learning</article-title>
          ”,
          <source>International Journal of Advanced Computer Science and Applications</source>
          , Vol.
          <volume>8</volume>
          , No.
          <issue>8</issue>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chillara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Kavitha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Neginhal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Haldia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Vidyullatha</surname>
          </string-name>
          , “
          <article-title>Music Genre Classification using Machine Learning Algorithms : A comparison,” no</article-title>
          . May, pp.
          <fpage>851</fpage>
          -
          <lpage>858</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Kalyani C.</given-names>
            <surname>Waghmare</surname>
          </string-name>
          and
          <string-name>
            <given-names>Balwant A.</given-names>
            <surname>Sonkamble</surname>
          </string-name>
          , “
          <article-title>Raga Identification Techniques for Classifying Indian Classical Music: A Survey”</article-title>
          ,
          <source>International Journal of Signal Processing Systems</source>
          Vol.
          <volume>5</volume>
          , No.
          <issue>4</issue>
          ,
          December
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>