Overview of the Mixed Script Information Retrieval (MSIR) at FIRE-2016

Somnath Banerjee, Jadavpur University, India (sb.cse.ju@gmail.com)
Kunal Chakma, NIT Agartala, India (kchax4377@gmail.com)
Sudip Kumar Naskar, Jadavpur University, India (sudip.naskar@cse.jdvu.ac.in)
Amitava Das, IIIT Sri City, India (amitava.das@iiits.in)
Paolo Rosso, Technical University of Valencia, Spain (prosso@dsic.upv.es)
Sivaji Bandyopadhyay, Jadavpur University, India (sbandyopadhyay@cse.jdvu.ac.in)
Monojit Choudhury, Microsoft Research India (monojitc@microsoft.com)

ABSTRACT

The shared task on Mixed Script Information Retrieval (MSIR) was organized for the fourth year in FIRE-2016. The track had two subtasks. Subtask-1 was on question classification, where the questions were code-mixed Bengali-English and the Bengali words were written in transliterated Roman script. Subtask-2 was on ad-hoc retrieval of code-mixed Hindi-English tweets, where the Hindi components of both the queries and the tweets were written in Roman transliterated form. A total of 33 runs were submitted by 9 participating teams, of which 20 runs were for subtask-1 by 7 teams and 13 runs were for subtask-2 by 7 teams. This overview presents a comprehensive report of the subtasks, the datasets and the performances of the submitted runs.

1. INTRODUCTION

A large number of languages, including Arabic, Russian, and most of the South and South East Asian languages like Bengali, Hindi, etc., have their own indigenous scripts. However, websites and user generated content (such as tweets and blogs) in these languages are often written in the Roman script due to various socio-cultural and technological reasons [1]. This process of phonetically representing the words of a language in a non-native script is called transliteration. Since English is the most popular language of the web, transliteration, especially into the Roman script, is used abundantly on the Web not only for documents but also for user queries that intend to search for these documents. This situation, where both documents and queries can be in more than one script and the user expectation could be to retrieve documents across scripts, is referred to as Mixed Script Information Retrieval.

The MSIR shared task was introduced in 2013 as "Transliterated Search" at FIRE-2013 [13]. Two pilot subtasks on transliterated search were introduced as part of the FIRE-2013 shared task. Subtask-1 was on language identification of the query words and subsequent back transliteration of the Indian language words; it was conducted for three Indian languages - Hindi, Bengali and Gujarati. Subtask-2 was on ad hoc retrieval of Bollywood song lyrics, one of the most common forms of transliterated search that commercial search engines have to tackle. Five teams participated in the shared task.

In FIRE-2014, the scope of subtask-1 was extended to cover three more South Indian languages - Tamil, Kannada and Malayalam. In subtask-2, (a) queries in Devanagari script and (b) more natural queries with splitting and joining of words were introduced. More than 15 teams participated in the two subtasks [8].

Last year (FIRE-2015), the shared task was renamed from "Transliterated Search" to "Mixed Script Information Retrieval (MSIR)" to align it with the framework proposed in [9]. In FIRE-2015, three subtasks were conducted [16]. Subtask-1 was extended further by including more Indic languages, and transliterated text from all the languages was mixed. Subtask-2 was on searching movie dialogues and reviews along with song lyrics. Mixed script question answering (MSQA) was introduced as subtask-3. A total of 10 teams made 24 submissions for subtask-1 and subtask-2. In spite of a significant number of registrations, no run was received for subtask-3.

This year, we hosted two subtasks in the MSIR shared task. Subtask-1 was on classifying code-mixed cross-script questions and was the continuation of last year's subtask-3; here, Bengali words were written in Roman transliterated form. Subtask-2 was on information retrieval of Hindi-English code-mixed tweets. The objective of subtask-2 was to retrieve the top-k tweets from a corpus [7] for a given query consisting of Hindi-English terms, where the Hindi terms are written in Roman transliterated form.
This paper provides an overview of the MSIR track at the Eighth Forum for Information Retrieval Evaluation (FIRE-2016). The track was coordinated jointly by Microsoft Research India, Jadavpur University, Technical University of Valencia, IIIT Sri City and NIT Agartala. Details of the tasks can also be found on the website https://msir2016.github.io/.

The rest of the paper is organized as follows. Sections 2 and 3 describe the datasets and present and analyze the run submissions for subtask-1 and subtask-2, respectively. We conclude with a summary in Section 4.

2. SUBTASK-1: CODE-MIXED CROSS-SCRIPT QUESTION ANSWERING

Being a classic application of natural language processing, question answering (QA) has practical applications in various domains such as education, health care, personal assistance, etc. QA is a retrieval task that is more challenging than the task of common search engines, because the purpose of QA is to find an accurate and concise answer to a question rather than just retrieving relevant documents containing the answer [10]. Recently, the code-mixed cross-script QA research problem was formally introduced in [3].

The first step in understanding a question is to perform question analysis. Question classification is an important task in question analysis which detects the answer type of the question. Question classification helps not only to filter out a wide range of candidate answers but also to determine answer selection strategies [10]. Furthermore, it has been observed that the performance of question classification has a significant influence on the overall performance of a QA system.

Let Q = {q1, q2, ..., qn} be a set of factoid questions associated with a domain D. Each question q = ⟨w1 w2 w3 ... wp⟩ is a sequence of words, where p denotes the total number of words in the question. In the code-mixed scenario, the words w1, w2, w3, ..., wp could be English words or words transliterated from Bengali. Let C = {c1, c2, ..., cm} be the set of question classes. Here n and m refer to the total number of questions and question classes, respectively. The objective of this subtask is to classify each given question qi ∈ Q into one of the predefined coarse-grained classes cj ∈ C. For example, the question "last volvo bus kokhon chare?" (English gloss: "When does the last volvo bus depart?") should be classified into the class 'TEMPORAL'.
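To make the task setup concrete, the following minimal sketch (ours, not any participant's system) maps code-mixed cross-script questions to coarse-grained classes with character n-gram TF-IDF features and a linear classifier, assuming scikit-learn is available; the toy questions, labels and hyper-parameters are illustrative placeholders rather than the official dataset.

```python
# Illustrative sketch of subtask-1: classify code-mixed cross-script questions
# into coarse-grained answer-type classes. The tiny inline training set and the
# hyper-parameters are placeholders, not the shared-task data or any team's run.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_questions = [
    "last volvo bus kokhon chare?",   # TEMP (example from the task description)
    "hotel ta kothay?",               # LOC (hypothetical)
    "ticket er dam koto?",            # MNY (hypothetical)
]
train_labels = ["TEMP", "LOC", "MNY"]

# Character n-grams are fairly robust to noisy Roman transliterations of Bengali.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_questions, train_labels)

# Most likely ['TEMP'], given the overlap with the first training question.
print(clf.predict(["bus kokhon chare?"]))
```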
2.1 Datasets

We prepared the datasets for subtask-1 from the dataset described in [3], which is the only dataset available for code-mixed cross-script question answering research. That dataset contains questions, messages and answers from the sports and tourism domains in code-mixed cross-script English-Bengali, spread over a total of 20 documents. The 10 documents in the sports domain consist of 116 informal posts and 192 questions, while the 10 documents in the tourism domain consist of 183 informal posts and 314 questions. We initially provided 330 labeled factoid questions as the development set to the participants after they accepted the data usage agreement. The testset contains 180 unlabeled factoid questions. Table 1 and Table 2 provide statistics of the dataset. The question-class-specific distribution of the datasets is given in Figure 1.

Table 1: MSIR16 Subtask-1 Datasets
Dataset    Questions (Q)   Total Words   Avg. Words/Q
Trainset   330             1776          5.321
Testset    180             1138          6.322

Table 2: Subtask-1: Question class statistics
Class                  Training   Testing
Person (PER)           55         27
Location (LOC)         26         23
Organization (ORG)     67         24
Temporal (TEMP)        61         25
Numerical (NUM)        45         26
Distance (DIST)        24         21
Money (MNY)            26         16
Object (OBJ)           21         10
Miscellaneous (MISC)   5          8

[Figure 1: Classwise distribution of the dataset]

2.2 Submissions

A total of 15 research teams registered for subtask-1. However, only 7 teams submitted runs, and a total of 20 runs were received. All the teams submitted 3 runs except AMRITA CEN, who submitted 2 runs.

The AMRITA CEN [2] team submitted 2 runs. They used a bag-of-words (BoW) model for Run-1, while Run-2 was based on a Recurrent Neural Network (RNN): the initial embedding vector was given to the RNN and the output of the RNN was fed to logistic regression for training. Overall, the BoW model outperformed the RNN model by almost 7% in F1-measure.

The AMRITA-CEN-NLP [4] team submitted 3 runs. They approached the problem using a Vector Space Model (VSM), applying context-based term weighting to overcome the shortcomings of the VSM. The proposed approach achieved up to 80% in terms of F1-measure.

ANUJ [15] also submitted 3 runs. The author used term frequency-inverse document frequency (TF-IDF) vectors as features. A number of machine learning algorithms, namely Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF) and Gradient Boosting, were applied with grid search to find the best parameters and model. The RF model performed the best among the 3 runs.

BITS PILANI [5] submitted 3 runs. Instead of applying classifiers directly on the code-mixed cross-script data, they converted the data into English; the translation was performed using the Google translation API (https://translate.google.com/). They then applied one machine learning classifier per run, namely Gaussian Naïve Bayes, LR and an RF classifier. The Gaussian Naïve Bayes classifier outperformed the other two classifiers.

IINTU [6] was the best performing team. The team submitted 3 runs, all based on machine learning approaches. They trained three separate classifiers, namely RF, One-vs-Rest and k-NN, and then built an ensemble classifier from these 3 classifiers for the classification task. The ensemble classifier took the output label of each of the individual classifiers and selected the majority label as the output; in case of a tie, one of the tied labels was chosen at random.
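A minimal sketch of the majority-voting scheme described for IINTU is given below, assuming scikit-learn; the feature extraction, hyper-parameters and tie-breaking seed are our assumptions, since the working notes [6] only name the three base classifiers and the voting rule.

```python
# Sketch of an IINTU-style ensemble: three independent classifiers vote on each
# question, the majority label wins, and ties are broken at random.
import random
from collections import Counter

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier

def ensemble_predict(train_questions, train_labels, test_questions, seed=0):
    rng = random.Random(seed)
    vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    X_train = vec.fit_transform(train_questions)
    X_test = vec.transform(test_questions)

    # The three base classifiers named in the working notes; the exact
    # hyper-parameters used by the team are not stated and are assumed here.
    models = [
        RandomForestClassifier(n_estimators=100, random_state=seed),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
        KNeighborsClassifier(n_neighbors=3),
    ]
    votes = []
    for model in models:
        model.fit(X_train, train_labels)
        votes.append(model.predict(X_test))

    predictions = []
    for per_question in zip(*votes):              # one label per base classifier
        counts = Counter(per_question)
        top = max(counts.values())
        tied = [label for label, c in counts.items() if c == top]
        predictions.append(rng.choice(tied))      # random choice on a tie
    return predictions

# Toy usage with hypothetical labeled questions (placeholders, not the dataset).
train_q = ["last volvo bus kokhon chare?", "hotel er naam ki?", "ticket er dam koto?",
           "bus stand kothay?", "match kobe shuru hobe?", "koto taka lagbe?"]
train_y = ["TEMP", "ORG", "MNY", "LOC", "TEMP", "MNY"]
print(ensemble_predict(train_q, train_y, ["train kokhon chare?"]))  # e.g. ['TEMP']
```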
NLP-NITMZ [12] submitted 3 runs, of which 2 were rule based: a first set of direct rules was applied for Run-1, while a second set of dependent rules was used for Run-3. A total of 39 rules were identified for the rule-based runs. A Naïve Bayes classifier was used in Run-2, whereas a Naïve Bayes updateable classifier was used in Run-3.

IIT(ISM)D used three different machine learning based classification models - Sequential Minimal Optimization, Naïve Bayes Multimodel and Decision Tree FT - to annotate the question text. This team submitted its runs after the deadline.

2.3 Results

In this section, we define the evaluation metrics used to evaluate the runs submitted to subtask-1. Typically, the performance of a question classifier is measured by calculating the accuracy of that classifier on a particular test set [10]. We also used this metric for evaluating code-mixed cross-script question classification performance:

accuracy = \frac{\text{number of correctly classified samples}}{\text{total number of testset samples}}

In addition, we computed the standard precision, recall and F1-measure to evaluate the class-specific performances of the participating systems. The precision, recall and F1-measure of a classifier on a particular class c are defined as follows:

precision\ (P) = \frac{\text{number of samples correctly classified as } c}{\text{number of samples classified as } c}

recall\ (R) = \frac{\text{number of samples correctly classified as } c}{\text{total number of samples in class } c}

F1\text{-measure} = \frac{2 \cdot P \cdot R}{P + R}
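For completeness, the snippet below shows one way to compute these metrics with scikit-learn on a toy set of gold and predicted labels; it is an illustration of the definitions above, not the official evaluation script, and the label values are placeholders.

```python
# Illustrative computation of accuracy plus per-class precision, recall and F1
# for a toy set of gold and predicted question classes.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

gold = ["TEMP", "LOC", "PER", "TEMP", "MISC"]
pred = ["TEMP", "LOC", "LOC", "TEMP", "PER"]
classes = ["PER", "LOC", "TEMP", "MISC"]

print("accuracy:", accuracy_score(gold, pred))
p, r, f1, _ = precision_recall_fscore_support(
    gold, pred, labels=classes, zero_division=0
)
for cls, pi, ri, fi in zip(classes, p, r, f1):
    print(f"{cls}: P={pi:.3f} R={ri:.3f} F1={fi:.3f}")
```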
Table 3 presents the performance of the submitted runs in terms of accuracy; class-specific performances are reported in Table 4. A baseline system was also developed for the sake of comparison using the BoW model, which obtained 79.444% accuracy. It can be observed from Table 3 that the highest accuracy (83.333%) was achieved by the IINTU team. The classification performance on the temporal (TEMP) class was very high for almost all the teams. However, Table 4 and Figure 2 suggest that the miscellaneous (MISC) question class was very difficult to identify; most of the teams could not identify the MISC class at all. The reason could be the very low presence (2%) of the MISC class in the training dataset.

Table 3: Subtask-1: Team performance
Team             Run ID   Correct   Incorrect   Accuracy
Baseline         -        143       37          79.444
AmritaCEN        1        145       35          80.556
AmritaCEN        2        133       47          73.889
AMRITA-CEN-NLP   1        143       37          79.444
AMRITA-CEN-NLP   2        132       48          73.333
AMRITA-CEN-NLP   3        132       48          73.333
Anuj             1        139       41          77.222
Anuj             2        146       34          81.111
Anuj             3        141       39          78.333
BITS PILANI      1        146       34          81.111
BITS PILANI      2        144       36          80.000
BITS PILANI      3        131       49          72.778
IINTU            1        147       33          81.667
IINTU            2        150       30          83.333
IINTU            3        146       34          81.111
NLP-NITMZ        1        134       46          74.444
NLP-NITMZ        2        134       46          74.444
NLP-NITMZ        3        142       38          78.889
*IIT(ISM)D       1        144       36          80.000
*IIT(ISM)D       2        142       38          78.889
*IIT(ISM)D       3        144       36          80.000
(* denotes late submission)

[Figure 2: Subtask-1: F-measure of different teams for each class (best run)]

Table 4: Subtask-1: Class specific performances (NA denotes no identification of a class)
Team             Run ID   PER      LOC      ORG      NUM      TEMP     MONEY    DIST     OBJ      MISC
AmritaCEN        1        0.8214   0.8182   0.5667   0.9286   1.0000   0.7742   0.9756   0.5714   NA
AmritaCEN        2        0.7541   0.8095   0.6667   0.8125   1.0000   0.4615   0.8649   NA       NA
AMRITA-CEN-NLP   1        0.8000   0.8936   0.6032   0.8525   0.9796   0.7200   0.9500   0.5882   NA
AMRITA-CEN-NLP   2        0.7500   0.7273   0.5507   0.8387   0.9434   0.5833   0.9756   0.1818   NA
AMRITA-CEN-NLP   3        0.6939   0.8936   0.5455   0.8125   0.9804   0.6154   0.8333   0.3077   NA
IINTU            1        0.7843   0.8571   0.6333   0.9286   1.0000   0.8125   0.9756   0.4615   NA
IINTU            2        0.8077   0.8980   0.6552   0.9455   1.0000   0.8125   0.9756   0.5333   NA
IINTU            3        0.7600   0.8571   0.5938   0.9455   1.0000   0.8571   0.9767   0.4615   NA
NLP-NITMZ        1        0.7347   0.8444   0.5667   0.8387   0.9796   0.6154   0.9268   0.2857   0.1429
NLP-NITMZ        2        0.6190   0.8444   0.5667   0.9630   0.8000   0.7333   0.9756   0.4286   0.1429
NLP-NITMZ        3        0.8571   0.8163   0.7000   0.8966   0.9583   0.7407   0.9268   0.3333   0.2000
Anuj             1        0.7600   0.8936   0.6032   0.8125   0.9804   0.7200   0.8649   0.5333   NA
Anuj             2        0.8163   0.8163   0.5538   0.9811   0.9796   0.9677   0.9500   0.5000   NA
Anuj             3        0.8163   0.8936   0.5846   0.8254   1.0000   0.7200   0.8947   0.5333   NA
BITS PILANI      1        0.7297   0.7442   0.7442   0.9600   0.9200   0.9412   0.9500   0.5000   0.2000
BITS PILANI      2        0.6753   0.7805   0.7273   0.9455   0.9600   1.0000   0.8947   0.4286   NA
BITS PILANI      3        0.6190   0.7805   0.7179   0.8125   0.8936   0.9333   0.6452   0.5333   NA
*IIT(ISM)D       1        0.7755   0.8936   0.6129   0.8966   0.9412   0.7692   0.9524   0.5882   NA
*IIT(ISM)D       2        0.8400   0.8750   0.6780   0.8525   0.9091   0.6667   0.9500   0.1667   NA
*IIT(ISM)D       3        0.8000   0.8936   0.6207   0.8667   1.0000   0.6923   0.9500   0.5333   NA
Avg                       0.7607   0.8415   0.6245   0.8858   0.9613   0.7568   0.9204   0.4458   NA
(* denotes late submission)

3. SUBTASK-2: INFORMATION RETRIEVAL ON CODE-MIXED HINDI-ENGLISH TWEETS

This subtask is based on the concepts discussed in [9]. The objective was to retrieve code-mixed Hindi-English tweets from a corpus for code-mixed queries. The Hindi components in both the tweets and the queries are written in Roman transliterated form; the subtask did not consider cases where both Roman and Devanagari scripts are present. The documents in this case are therefore tweets consisting of code-mixed Hindi-English text in which the Hindi terms are in Roman transliterated form. Given a query consisting of Hindi and English terms written in Roman script, the system has to retrieve the top-k documents (i.e., tweets) from a corpus of code-mixed Hindi-English tweets. The expected output is a ranked list of the top twenty (k = 20 here) tweets retrieved from the given corpus.

3.1 Datasets

Initially, we released 6,133 code-mixed Hindi-English tweets with 23 queries as the training dataset. Later, we released a document collection containing 2,796 code-mixed tweets along with 12 code-mixed queries as the testset. Query terms are mostly named entities together with Roman transliterated Hindi words. The average query length is 3.43 words in the training set and 3.25 words in the testset. The tweets in the training set cover 10 topics, whereas the testset covers 3 topics.

3.2 Submissions

This year, a total of 7 teams submitted 13 runs. The submitted runs for the code-mixed tweet retrieval task mostly preprocessed the data and then applied different techniques for retrieving the desired tweets. Team Amrita CEN [4] removed some Hindi/English stop words to declutter useless words and then tokenized all the tweets; cosine distance was used to score the relevance of tweets to the query, and the top 20 tweets by score were retrieved. Team CEN@Amrita [14] used a Vector Space Model based approach.
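As a rough illustration of such vector-space retrieval (not a reconstruction of any team's system), the sketch below indexes a handful of made-up code-mixed tweets with TF-IDF, scores them against a query by cosine similarity, and returns a ranked top-20 list, assuming scikit-learn.

```python
# Minimal vector-space retrieval sketch for subtask-2: TF-IDF vectors with
# cosine similarity, returning the 20 highest-scoring tweets for a query.
# The corpus and query below are placeholders, not the shared-task data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

tweets = [
    "dilwale movie dekhi kal, kya mast thi",
    "traffic jam mein phas gaya yaar",
    "shahrukh ka naya gaana sun liya?",
]
query = "dilwale movie"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(tweets)       # index the tweet corpus
query_vector = vectorizer.transform([query])

scores = cosine_similarity(query_vector, doc_vectors)[0]
top_k = scores.argsort()[::-1][:20]                  # ranked list, k = 20
for rank, idx in enumerate(top_k, start=1):
    print(rank, round(float(scores[idx]), 3), tweets[idx])
```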
Team UB [11] adopted three different techniques for the retrieval task. The first was Named Entity boosting: the query was parsed to extract NEs, and each document (tweet) that matched a given NE received a small numeric boost; at a second level of boosting, phrase matching was performed, i.e., documents that more closely matched the input query phrase were ranked higher than those that did not. Synonym expansion and narrative-based weighting were used as the second and third techniques. Team NITA NITMZ performed stop word removal followed by query segmentation and, finally, merging.

Team IIT(ISM)D considered every tweet as a document and indexed the collection using uniword indexing on a Terrier implementation. Query terms were expanded using the Soundex coding scheme: terms with identical Soundex codes were selected as candidate query terms and included in the final queries to retrieve the relevant tweets (documents). They further used three different retrieval models, BM25, DFR and TF-IDF, to measure similarity. However, this team submitted its runs after the deadline.

3.3 Results

The retrieval task requires that, for a given query, retrieved documents at higher ranks count for more than retrieved documents at lower ranks, and we want our measures to account for that. Set-based evaluation metrics such as precision, recall and F-measure are therefore not suitable for this task, and we used Mean Average Precision (MAP) as the performance evaluation metric for subtask-2. MAP is also referred to as "average precision at seen relevant documents": average precision is first computed for each query, and the average precisions are then averaged over the queries. MAP is defined as

MAP = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{Q_j} \sum_{i=1}^{Q_j} P(doc_i)

where Q_j refers to the number of relevant documents for query j, N indicates the number of queries, and P(doc_i) represents the precision at the i-th relevant document.
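The following short sketch implements this formula directly for a toy run; it is an illustration, not the official evaluation script. Average precision is computed per query from a ranked list and the set of relevant tweet ids, then averaged over queries.

```python
# Illustrative MAP computation following the formula above: precision is taken
# at each rank where a relevant document appears, averaged per query, then
# averaged over queries. The ranked lists and relevance sets are placeholders.
def average_precision(ranked_ids, relevant_ids):
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)   # P(doc_i) at the i-th relevant doc
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    # runs: list of (ranked_ids, relevant_ids) pairs, one per query
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example with two queries and top-5 rankings.
runs = [
    (["t3", "t1", "t7", "t2", "t9"], {"t1", "t2"}),
    (["t5", "t4", "t8", "t6", "t0"], {"t8"}),
]
print(round(mean_average_precision(runs), 4))  # prints 0.4167
```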
The evaluation results of the submitted runs are reported in Table 5. The highest MAP (0.0377) was achieved by team Amrita CEN, which is still very low. The significantly low MAP values in Table 5 suggest that retrieving code-mixed tweets for query terms comprising code-mixed Hindi and English words is a difficult task and that the techniques proposed by the teams do not yet produce satisfactory results. The problem of retrieving relevant code-mixed tweets therefore requires better techniques and methodologies to be developed for improving system performance.

Table 5: Results for Subtask-2 showing Mean Average Precision
Team         Run ID   MAP
UB           1        0.0217
UB           2        0.0160
UB           3        0.0152
Anuj         1        0.0209
Anuj         2        0.0199
Amrita CEN   1        0.0377
NLP-NITMZ    1        0.0203
NITA NITMZ   1        0.0047
CEN@Amrita   1        0.0315
CEN@Amrita   2        0.0168
IIT(ISM)D    1        0.0021
IIT(ISM)D    2        0.0083
IIT(ISM)D    3        0.0021

4. SUMMARY

In this overview, we elaborated on the two subtasks of MSIR-2016 at FIRE-2016. The overview is divided into two major parts, one for each subtask, where the dataset, evaluation metrics and results are discussed in detail. A total of 33 runs were submitted by 9 unique teams.

In subtask-1, 20 runs were received from 7 teams. The best performing team achieved 83.333% accuracy, and the average question classification accuracy was 78.19%, which is quite satisfactory considering that this is a new research problem. Subtask-1 dealt with the code-mixed Bengali-English language pair; in the coming years, we would like to include more Indian languages. The participation was encouraging and we plan to continue subtask-1 in subsequent FIRE conferences.

Subtask-2 received a total of 13 run submissions from 7 teams, out of which one team submitted after the deadline. The best MAP value achieved was 0.0377, which is considerably low. From the results of the run submissions it can be inferred that information retrieval of code-mixed informal microblog texts such as tweets is a very challenging task. The stated problem therefore opens up and calls for new avenues of research toward developing better techniques and methodologies.

5. ACKNOWLEDGMENTS

Somnath Banerjee, Sudip Kumar Naskar and Sivaji Bandyopadhyay acknowledge the support of the Ministry of Electronics and Information Technology (MeitY), Government of India, through the project "CLIA System Phase II". The work of Paolo Rosso has been partially funded by the SomEMBED MINECO research project (TIN2015-71147-C2-1-P) and by the Generalitat Valenciana under the grant ALMAMATER (PrometeoII/2014/030). We would also like to thank everybody who helped spread awareness about this track in their respective capacities, and the entire FIRE team for giving us the opportunity and platform for conducting this new track smoothly.

6. REFERENCES

[1] U. Z. Ahmed, K. Bali, M. Choudhury, and S. VB. Challenges in designing input method editors for Indian languages: The role of word-origin and context. In Advances in Text Input Methods (WTIM 2011), pages 1-9, 2011.
[2] M. Anand Kumar and K. P. Soman. Amrita-CEN@MSIR-FIRE2016: Code-Mixed Question Classification using BoWs and RNN Embeddings. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[3] S. Banerjee, S. K. Naskar, P. Rosso, and S. Bandyopadhyay. The First Cross-Script Code-Mixed Question Answering Corpus. In Proceedings of the Workshop on Modeling, Learning and Mining for Cross/Multilinguality (MultiLingMine 2016), co-located with the 38th European Conference on Information Retrieval (ECIR), 2016.
[4] H. B. Barathi Ganesh, M. Anand Kumar, and K. P. Soman. Distributional Semantic Representation for Text Classification and Information Retrieval. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[5] R. Bhargava, S. Khandelwal, A. Bhatia, and Y. Sharma. Modeling Classifier for Code Mixed Cross Script Questions. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[6] D. Bhattacharjee and P. Bhattacharya. Ensemble Classifier based approach for Code-Mixed Cross-Script Question Classification. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[7] K. Chakma and A. Das. CMIR: A Corpus for Evaluation of Code Mixed Information Retrieval of Hindi-English Tweets. In the 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLING), April 2016.
[8] M. Choudhury, G. C. Gupta, P. Gupta, and A. Das. Overview of FIRE 2014 Track on Transliterated Search, 2014.
[9] P. Gupta, K. Bali, R. E. Banchs, M. Choudhury, and P. Rosso. Query expansion for mixed-script information retrieval. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 677-686. ACM, 2014.
[10] X. Li and D. Roth. Learning question classifiers. In Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, pages 1-7. Association for Computational Linguistics, 2002.
[11] N. Londhe and R. K. Srihari. Exploiting Named Entity Mentions Towards Code Mixed IR: Working Notes for the UB system submission for MSIR@FIRE'16.
[12] G. Majumder and P. Pakray. NLP-NITMZ @ MSIR 2016 System for Code-Mixed Cross-Script Question Classification. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[13] R. S. Roy, M. Choudhury, P. Majumder, and K. Agarwal. Overview and datasets of FIRE 2013 track on transliterated search. In Fifth Forum for Information Retrieval Evaluation, 2013.
[14] S. Singh, M. Anand Kumar, and K. P. Soman. CEN@Amrita: Information Retrieval on Code-Mixed Hindi-English Tweets Using Vector Space Models. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, December 2016.
[15] A. Saini. Code Mixed Cross Script Question Classification. In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings. CEUR-WS.org, 2016.
[16] R. Sequiera, M. Choudhury, P. Gupta, P. Rosso, S. Kumar, S. Banerjee, S. K. Naskar, S. Bandyopadhyay, G. Chittaranjan, A. Das, and K. Chakma. Overview of FIRE-2015 Shared Task on Mixed Script Information Retrieval.