=Paper=
{{Paper
|id=Vol-2786/Paper34
|storemode=property
|title=Indian Classical Raga Identification using Machine Learning
|pdfUrl=https://ceur-ws.org/Vol-2786/Paper34.pdf
|volume=Vol-2786
|authors=Dipti Joshi,Jyoti Pareek,Pushkar Ambatkar
|dblpUrl=https://dblp.org/rec/conf/isic2/JoshiPA21
}}
==Indian Classical Raga Identification using Machine Learning==
Dipti Joshi, Dr. Jyoti Pareek, Pushkar Ambatkar
Department of Computer Science, Gujarat University, Ahmedabad, Gujarat, India
Abstract
Ragas are the pride of Indian classical music, and the raga is its original musical form. A raga consists of a set of swaras (notes) whose characteristics together form a melodic conception that is performed by instruments and singers. Based on the features of the raga, Indian classical music is divided into two parts: Hindustani (North Indian) and Carnatic (South Indian) classical music. Our experiment concentrates on Hindustani classical music. We apply K Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers to a dataset of the ragas Yaman and Bhairavi to classify and identify the raga. Both the KNN and the SVM classifier produced accurate results.
Keywords
Raga identification, Feature extraction, Machine Learning, KNN, SVM
ISIC'21: International Semantic Intelligence Conference, February 25-27, 2021, New Delhi, India
joshidipti1408@gmail.com (D. Joshi)
0000-0001-9166-4555 (D. Joshi)
©2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Indian classical music is the music of the Indian subcontinent, and the raga (or raag) holds a prominent position in it. A raag is a collection of musical notes that is pleasing when sung or performed on a musical instrument. Raga recognition comprises methods that identify the notes in a piece of music and classify them into the appropriate raga. In Hindustani classical music, the raga is a central concept that expresses the mood and sentiment of a performance. For a listener, classifying ragas is an intellectual process that becomes possible only after sufficient exposure. For automated recognition, the attributes of a raga have to be translated into suitable features.

The characteristics of ragas derive from Indian classical music techniques. A melody must combine notes with the following features to qualify as a raga.

Notes (swaras)
A raag has to contain between five and seven notes (swaras). The seven primary notes are S (Sa), R (Re or Ri), G (Ga), M (Ma), P (Pa), D (Dha), and N (Ni).

Aaroh and Avroh
Each raag has an "Aaroh", the ascending sequence of swaras, and an "Avroh", the descending sequence of swaras.

Vadi and Samvadi
Each raag has a "Vadi", the main note, and a "Samvadi", the supporting note.

Gamakas
Notes in a raga are rendered as a series of continuous variations (a back-and-forth movement in rhythm) with a constant frequency rate. Notes of this sort are known as Gamakas.

Pakad
A Pakad is a set of swaras that distinctively identifies a raga. Each raga has its own Pakad.

Tala
Tala refers to a rhythmic form that is constructed from a variety of beats.
Thaat
Thaats are used in raga classification. There are ten distinct Thaats: Kalyan, Bilawal, Bhairav, Khamaj, Poorvi, Marwa, Kafi, Asawari, Bhairavi, and Todi.

A piece of music has to be converted to swaras before it can be classified. This conversion is difficult for the following reasons [1]:

1. During a performance, a piece of music is made up of many instruments.
2. The notes in Indian classical music are on a relative scale.
3. A raga has no fixed starting swara.
4. In Indian music, the notes do not have a predetermined frequency.
5. In classical music, the sequence of swaras in a particular raga is not fixed, because the form allows various innovations.

The key purpose of raga recognition is to provide a good starting point for Hindustani music information retrieval. It also allows us to predict the raga of a performance and to assess the accuracy of that prediction. Beyond music analysis, it can be used to create playlists focused on particular ragas.

2. Related work

In this section, we review work done by other authors and analyze it for future scope. We give an analysis of various classifiers, their relevance, and their performance for raga identification.

Hiteshwari Sharma and Rasmeet S. Bali [2] performed raga identification on four ragas (Des, Bhupali, Yaman, and Todi) using a dataset of live performances, both vocal and instrumental. They carried out the identification using pitch class profile and n-gram histogram with machine learning classifiers, and obtained 83.39% accuracy for the pitch class profile and 97.3% for the n-gram histogram.

Ekta Patel and Savita Chauhan [3] used a MATLAB toolbox to extract features from the music tracks. The machine learning tool WEKA, which works on the .arff file format, was used to apply Bayesian net, Naive Bayes, Support Vector Machine (SVM), J48, Decision table, and Random forest classifiers to a dataset of the ragas Bhairav, Yaman, Shanakara, and Saarang. The main challenges are complicated variables such as pitch and mood in the music track, skipped tones, and the transformation of various dataset parameters and raags. The results are compared before and after discretization. In this raag identification task the probability-based classifiers were more accurate, and Bayesian Net performed best among them.

Hiteshwari Sharma and R. S. Bali [4] identified several key variables for raga classification and used a soft-computing fuzzy-set technique for raga recognition. They used a dataset of five ragas (Des, Bhupali, Yaman, Todi, and Pahadi) with three parameters (time, dirgaswaras, and vadi) and achieved reasonable accuracy.

G. Pandey, C. Mishra, and Paul Ipe [5] applied a Hidden Markov Model (HMM) and pakad matching to a dataset of two ragas, Bhupali and Yaman Kalyan. They achieved 77% accuracy with the basic HMM and 87% accuracy with HMM combined with pakad matching.

Muhammad Asim Ali and Zain Ahmed Siddiqui [6] studied automatic music genre classification using machine learning. They used the K Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms to predict the genre of songs. Their data came from the GTZAN dataset of 1000 songs, which covers ten genres: blues, hip-hop, jazz, classical, metal, reggae, country, pop, disco, and rock. Their comparison shows that SVM is a more efficient classifier than KNN.

Snigdha Chillara, Kavitha A, Shwetha A Neginhal, Shreya Haldia, and Vidyullatha K [7] addressed the music genre classification problem and compared several models.
The models in [7] were compared on the Free Music Archive small (fma_small) dataset, and two kinds of input were given to them: the CNN models used spectrogram images, while the Logistic Regression and ANN models used audio features stored in a .csv file. The authors obtained 88.5% accuracy with the CNN on the spectrogram-based model, which compares well with the algorithms used by other authors.

3. Work Done

In this section, we briefly discuss different characteristics of audio and the machine learning algorithms K Nearest Neighbor and Support Vector Machine.

Feature Extraction
Each audio signal comprises several features, but we need to extract the ones that suit the problem we want to solve. The process of extracting characteristics to use in the study is referred to as feature extraction. Some of these characteristics are described briefly below.

Power Spectrogram
A spectrogram is a graphical representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. To determine the raga, we use the mean of the spectrogram to find which tone or pitch is used most.

MFCC (Mel-Frequency Cepstral Coefficients)
MFCCs are among the most important features for describing an audio signal and are widely used when working with audio. The MFCCs of a signal are a small set of features (usually about 10 to 20) that concisely describe the overall shape of the spectral envelope.

Spectral Centroid
The spectral centroid indicates the "center of mass" of the sound, computed as the weighted mean of the frequencies present in it. If the frequencies in the music are evenly distributed over a time span, the spectral centroid lies near the center. If there are more high frequencies toward the end of the sound, the centroid shifts toward its end.

Zero-Crossing Rate
The zero-crossing rate is the rate at which the signal changes sign, that is, goes from positive to negative or vice versa. It is commonly used in speech recognition and music information retrieval. It takes high values for noisy sounds such as those in metal and rock.

Roll-Off Frequency
Roll-off refers to the action of a particular type of filter, one designed to attenuate (roll off) frequencies above or below a certain point. It is called roll-off because the attenuation is gradual.

Spectral Bandwidth
Spectral bandwidth is the width of the band over which a radiated spectral quantity is not less than half its maximum value. It measures the extent of the spectrum, that is, the difference between the upper and lower frequencies of that band.

Machine Learning Algorithms
During our analysis, we found that a supervised machine learning approach is a good fit for our problem. We tried several classification algorithms and found K Nearest Neighbor and Support Vector Machine to be quite appropriate for our experiment.

K Nearest Neighbor (KNN)
K Nearest Neighbor (KNN) is a supervised learning method. It is a simple but robust algorithm that can be applied to both regression and classification problems. To make a prediction, KNN uses the whole training set: a new data point is assigned to a category based on the labelled training points closest to it.

Support Vector Machine (SVM)
Support Vector Machine (SVM) is also a supervised learning method and is usually used for classification. The algorithm identifies a hyperplane that clearly divides the sample points with different labels.
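The spectral descriptors defined earlier in this section can be made concrete with a small sketch. It is not the Librosa pipeline used in the experiment: the function names, the naive DFT, the 85% roll-off fraction, and the synthetic 1000 Hz test tone are all assumptions chosen for illustration. Bandwidth is computed here as the magnitude-weighted spread around the centroid, which is one common definition and differs from the half-maximum width described above.

```python
import cmath
import math


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def magnitude_spectrum(signal, sample_rate):
    """Naive DFT; returns (bin frequencies in Hz, magnitudes) up to Nyquist."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency (the 'center of mass')."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)


def spectral_bandwidth(freqs, mags):
    """Magnitude-weighted standard deviation around the centroid (assumed definition)."""
    centre = spectral_centroid(freqs, mags)
    spread = sum(m * (f - centre) ** 2 for f, m in zip(freqs, mags)) / sum(mags)
    return math.sqrt(spread)


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest bin frequency below which `fraction` of the total magnitude lies."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


# Synthetic test tone (an assumption, not data from the paper):
# 256 samples of a 1000 Hz sine sampled at 8000 Hz, offset by half a sample
# so that no sample sits exactly on a zero crossing.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * (t + 0.5) / SAMPLE_RATE) for t in range(256)]

freqs, mags = magnitude_spectrum(tone, SAMPLE_RATE)
zcr = zero_crossing_rate(tone)
centroid = spectral_centroid(freqs, mags)
bandwidth = spectral_bandwidth(freqs, mags)
rolloff = spectral_rolloff(freqs, mags)
print(f"zcr={zcr:.3f} centroid={centroid:.1f} Hz rolloff={rolloff:.1f} Hz")
```

For a pure tone the centroid and roll-off both sit at the tone's frequency and the bandwidth is close to zero, which makes the sketch easy to check by hand. For real recordings, Librosa provides comparable per-frame features in its `librosa.feature` module.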
In an SVM, sample points of different classes lie on different sides of the separating hyperplane.

4. Dataset and Strategy

In this experiment, we used a dataset of audio files, created by extracting 60-second audio clips from recordings found on the internet. For music and audio analysis we used Librosa, a Python package that provides the building blocks needed to create music information retrieval systems. We also used Scikit-learn, an open-source machine learning library that supports both supervised and unsupervised learning and provides a variety of tools for model selection, data pre-processing, model fitting, and evaluation.

We chose the ragas Yaman and Bhairavi. We split the recordings into 60-second frames, which lets the computer work only on specific parts of the song such as the Pakad, Aaroh, and Avroh, and removes other noise and empty segments from the audio. Using Librosa, we extracted features such as MFCC, spectrogram, bandwidth, centroid, zero-crossing rate, and roll-off from each audio file and saved them to a .CSV file. The flow of the process is shown below.

[Figure: Audio Dataset → Feature Extraction → Identification and Classification]
Figure 1: Process of Raga Identification [8]

Figure 1 represents the classification procedure, in which features are extracted from each audio frame and compared with the weight of the closer mean. Using the Scikit-learn library, we applied the KNN and SVM algorithms to the .CSV data file. We found that KNN and SVM fit the raga classification task better than Logistic Regression.

5. Result and Discussion

In our experiment, we chose the ragas Yaman and Bhairavi. We used a vocal and instrumental dataset of 341 audio clips, of which 194 are Yaman and 147 are Bhairavi.

The tables below show the accuracy of the KNN and SVM algorithms.

{| class="wikitable"
|+ Table 1: Results of Raga Identification accuracy using KNN
! KNN (number of neighbours) !! Train/Test Ratio !! Classification Accuracy
|-
| 1 || 80/20 || 98%
|-
| 1 || 60/40 || 94%
|-
| 1 || 40/60 || 93%
|-
| 2 || 80/20 || 97%
|-
| 2 || 60/40 || 93%
|-
| 2 || 40/60 || 92%
|-
| 3 || 80/20 || 97%
|-
| 3 || 60/40 || 94%
|-
| 3 || 40/60 || 93%
|-
| 4 || 80/20 || 95%
|-
| 4 || 60/40 || 93%
|-
| 4 || 40/60 || 93%
|}

{| class="wikitable"
|+ Table 2: Results of Raga Identification accuracy using SVM
! Classifier !! Train/Test Ratio !! Classification Accuracy
|-
| SVM || 80/20 || 95%
|-
| SVM || 60/40 || 95%
|-
| SVM || 40/60 || 94%
|}

The two tables above report the results for the two classifiers, KNN and SVM. With KNN, the accuracy of raga identification varies with the number of neighbours. Each classifier was evaluated at several Train/Test ratios.
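The evaluation grid behind Table 1 (several Train/Test ratios crossed with several neighbour counts) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses a plain-Python nearest-neighbour vote instead of the Scikit-learn classifier used in the experiment, and synthetic two-dimensional points stand in for the extracted features. The raga names are used only as class labels, and the helper names, cluster centres, and seeds are hypothetical, so the printed accuracies say nothing about the real dataset.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    nearest = [label for _, label in by_distance[:k]]
    votes = Counter(nearest)
    # Break ties in favour of the label whose first occurrence is closest.
    return max(votes, key=lambda label: (votes[label], -nearest.index(label)))


def accuracy_for_split(samples, labels, train_fraction, k, seed=0):
    """Shuffle, split by `train_fraction`, and score KNN on the held-out part."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_fraction)
    train, test = order[:cut], order[cut:]
    train_x = [samples[i] for i in train]
    train_y = [labels[i] for i in train]
    correct = sum(
        knn_predict(train_x, train_y, samples[i], k) == labels[i] for i in test
    )
    return correct / len(test)


# Synthetic stand-in for the feature file: two well-separated Gaussian clusters.
rng = random.Random(42)
samples, labels = [], []
for label, centre in (("Yaman", 0.0), ("Bhairavi", 5.0)):
    for _ in range(60):
        samples.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
        labels.append(label)

# Same grid shape as Table 1: three Train/Test ratios, neighbour counts 1 to 4.
results = {
    (fraction, k): accuracy_for_split(samples, labels, fraction, k)
    for fraction in (0.8, 0.6, 0.4)
    for k in (1, 2, 3, 4)
}
for (fraction, k), acc in sorted(results.items()):
    print(f"train fraction {fraction:.1f}, k={k}: accuracy {acc:.2f}")
```

With even neighbour counts a vote can tie, so the sketch resolves ties toward the closer neighbour. Other tie-breaking rules are possible and may change individual predictions.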
The Train/Test ratio took three values: 80/20, 60/40, and 40/60. For every KNN value, the highest accuracy was obtained with the 80/20 ratio: for KNN values 1 to 4 we obtained 98%, 97%, 97%, and 95% accuracy respectively. With SVM, we obtained 95% accuracy for the 80/20 Train/Test ratio.

From this comparison, we conclude that KNN at the 80/20 Train/Test ratio gives the highest accuracy: it matches SVM at 4 neighbours (95%) and exceeds it at 1 to 3 neighbours (97% to 98%).

6. Conclusion and Future work

We gave a short introduction to the raga and its attributes, and surveyed earlier approaches to raga classification and recognition along with their datasets, implementation tools, accuracy, and problems. In this paper, we discussed the classification of the ragas Yaman and Bhairavi using the K Nearest Neighbor (KNN) and Support Vector Machine (SVM) machine learning algorithms. We obtained good results with both KNN and SVM, but in our experiment KNN performed slightly better.

In the future, we will expand our dataset with many other ragas to obtain more accurate results, and implement other classifiers to detect ragas in a well-defined manner.

References

[1] Vijay Kumar, Harit Pandya, and C. V. Jawahar, “Identifying Ragas in Indian Music,” 22nd International Conference on Pattern Recognition, 2014.
[2] H. Sharma and R. S. Bali, “Comparison of ML classifiers for Raga recognition,” Int. J. Sci. Res. Publ., vol. 5, no. 10, pp. 1–5, 2015.
[3] E. Patel and S. Chauhan, “Raag detection in music using supervised machine learning approach,” Int. J. Adv. Technol. Eng. Explor., vol. 4, no. 29, pp. 58–67, 2017.
[4] H. Sharma and R. S. Bali, “Raga identification of Hindustani music using soft computing techniques,” 2014 Recent Adv. Eng. Comput. Sci. (RAECS 2014), pp. 6–8, 2014.
[5] G. Pandey, C. Mishra, and P. Ipe, “Tansen: A system for automatic raga identification,” Indian Int. Conf. Artif. Intell., pp. 1350–1363, 2003.
[6] Muhammad Asim Ali and Zain Ahmed Siddiqui, “Automatic Music Genres Classification using Machine Learning,” International Journal of Advanced Computer Science and Applications, vol. 8, no. 8, 2017.
[7] S. Chillara, A. S. Kavitha, S. A. Neginhal, S. Haldia, and K. S. Vidyullatha, “Music Genre Classification using Machine Learning Algorithms: A comparison,” no. May, pp. 851–858, 2019.
[8] Kalyani C. Waghmare and Balwant A. Sonkamble, “Raga Identification Techniques for Classifying Indian Classical Music: A Survey,” International Journal of Signal Processing Systems, vol. 5, no. 4, December 2017.