<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Decoding Benglish: Scalable Information Retrieval for Transliterated Code-Mixed Conversations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harsh Mishra</string-name>
          <email>harshmishra83022@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ramya Sharma</string-name>
          <email>sharmaramya25@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Naina Yadav</string-name>
          <email>nainayadav585@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Dr. B. R. Ambedkar National Institute of Technology</institution>
          ,
          <addr-line>Jalandhar, Punjab</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science and Engineering, GLA University</institution>
          ,
          <addr-line>Mathura, Uttar Pradesh</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Code-mixing, the blending of words and grammatical forms from two or more languages within a single utterance, is an important linguistic phenomenon in multilingual communities. It has a strong and growing presence in India, especially on social networking websites, where people frequently write their native languages in Roman script alongside English words. This style of writing is particularly visible among migrant groups, e.g., Bengali communities in metropolitan cities, who use it as their main medium for communication and information sharing on online networks. The distinctive use of code-mixing in social spaces poses very difficult challenges for Information Retrieval (IR) and related mechanisms. Inconsistent spelling, casual transliteration, varying letter case, and unguided grammatical mixing all combine to degrade retrieval accuracy. In this paper, we describe a hybrid retrieval framework that combines a BM25 lexical retrieval method with Sentence-BERT (SBERT) semantic re-ranking, designed specifically for Bengali-English Roman-script code-mixed queries. We also build and annotate a new dataset with 107,900 documents, 20 queries, and a total of 5,409 relevance judgments (Qrels). Our system, submitted to the CMIR 2025 shared task under the team name "AiNauts" (Run 3), was ranked 8th, with a MAP of 0.054379, nDCG of 0.152751, P@5 of 0.173333, and P@10 of 0.13. These results represent an 83% improvement over a BM25 baseline, demonstrating a successful hybrid retrieval approach that merges lexical accuracy with semantic understanding.</p>
      </abstract>
      <kwd-group>
        <kwd>Code-mixed IR</kwd>
        <kwd>BM25</kwd>
        <kwd>SBERT</kwd>
        <kwd>FIRE 2025</kwd>
        <kwd>Information Retrieval</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The increasing use of social media for communication and information exchange has transformed
the way communities connect, especially multilingual ones [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Code-mixing is
commonplace in India [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], given its multilingualism. Users typically compose posts or queries in
Romanized forms of vernacular languages and insert words from English. For Bengali speakers,
particularly migrants in urban centers such as Delhi, Bangalore, or Mumbai, sites such as "Bengali in Delhi"
on Facebook serve as vital nodes for disseminating advice and resources [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This was especially the
case during the COVID-19 pandemic, when these groups played a key role in organizing information
about hospital availability, travel advisories, and local services [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Although widespread, code-mixing
presents extremely difficult problems for Information Retrieval (IR) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Queries tend to be brief, noisy,
and written in non-standard transliterations (e.g., bhalo, valo, and bhallo all meaning "good") [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Grammar rules are irregularly applied [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] as well, with English nouns occurring within Bengali or
vice versa. Standard keyword-based retrieval models fail to capture these irregularities, leading
to irrelevant or missed hits.
      </p>
      <p>This motivates research into hybrid retrieval methods that combine the strengths of lexical retrieval (e.g.,
BM25 for keyword overlap) and semantic reranking (e.g., SBERT for contextual modeling).</p>
      <p>In this paper, we investigate such a hybrid pipeline and show how effective it is on a carefully curated
Bengali-English test set.</p>
      <sec id="sec-1-1">
        <title>1.1. Key Contributions</title>
        <p>Our contributions are as follows:
• We introduce a new code-mixed Bengali-English dataset annotated with queries and relevance judgments.
• We introduce a hybrid BM25+SBERT retrieval pipeline for Roman-script, code-mixed IR.
• We measure system effectiveness against standard baselines (TF-IDF, BM25, Word2Vec) and
demonstrate significant gains.
• We offer extensive analysis in the form of graphs and qualitative case studies to underscore
real-world utility.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art and Related Work</title>
      <p>
        Multilingual and code-mixed Information Retrieval (IR) has evolved significantly over the last decade,
especially within the FIRE (Forum for Information Retrieval Evaluation) community. Early research on
Cross-Lingual IR (CLIR) and Code-Mixed IR (CMIR) primarily relied on lexical matching techniques
such as TF-IDF and BM25 over bilingual and mixed-script corpora [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. These approaches were efficient
for exact keyword matches but proved brittle when faced with transliteration variation, inconsistent
spellings, and the short, noisy queries typical of social media.
      </p>
      <p>
        To overcome these constraints, embedding-based techniques were put forward. Static embeddings
like Word2Vec [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and FastText [10] enhanced systems’ ability to capture distributional similarity between
queries and documents. However, they suffered from semantic drift and limited disambiguation
capability in code-mixed settings, often retrieving documents that were semantically
similar yet not relevant.
      </p>
      <p>The introduction of transformer-based multilingual encoders represents a paradigm shift. Models
like XLM-R [11], IndicBERT [12], and mT5 [13] provided more grounded contextual cues, additional
robustness to spelling variation, and state-of-the-art results in shared tasks worldwide, most
notably the FIRE CMIR/CLIR shared tasks [16, 17, 18]. These multilingual encoders, however, tend
to operate in a two-stage retrieval pipeline, wherein a first-stage retriever generates candidates
in a computationally efficient way and a second-stage neural reranker (e.g., SBERT, mono-T5, or mT5
[19]) then rescores the candidates. Despite these advantages, such models require large fine-tuning
corpora, transliteration normalization, and computationally expensive preprocessing, which diminishes
reproducibility and feasibility in low-resource settings. Parallel ideas in dense
retrieval have further extended these capabilities: LaBSE [14] provided multilingual sentence embeddings,
and ColBERT-X [15] introduced late interaction, which allows finer token-level matching while
keeping inference tractable. These methods offered specific benefits in terms of cross-lingual
alignment and high granularity for code-mixed text. However, their computational demands remain
high, and they suffer with highly idiosyncratic transliterations when normalization-based transliteration is
not applied.</p>
      <p>One consistent finding across FIRE working notes and shared tasks is that hybrid retrieval pipelines
achieve the best overall balance of effectiveness and efficiency. Candidate generation using BM25 (or
SPLADE/ANN) provides wide recall, while semantic rerankers (SBERT, mono-T5,
mT5) adjust the rankings by effectively considering contextual properties during reranking.
Hybrid models not only handle lexical mismatch more consistently, but also
fit naturally with transliteration and phonetic normalization practices (such as IndicNLP
rules or character-level models), reducing performance fluctuations on Romanized queries [26,
27, 28, 29, 30].</p>
      <sec id="sec-2-1">
        <title>2.1. Positioning of Our Work</title>
        <p>Drawing from these findings, this research uses a hybrid retrieval architecture (BM25 →
SBERT) focused specifically on Bengali–English Roman-script queries. This design is less studied
than Hindi–English and Tamil–English pipelines, but it is equally relevant given the volume
of Romanized Bengali appearing in user-generated content. By presenting a new Bengali–English
Roman-script dataset and hybrid design, our work directly tackles two of the significant issues
identified in the previous literature:
• Lexical mismatch, mitigated by BM25 and normalization.
• Semantic ambiguity, resolved through sentence-level reranking.</p>
        <p>This positions our work as a
new baseline for low-resource CMIR, extending the scope of FIRE research to an under-represented
language pair while ensuring both practical efficiency and semantic depth.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The proposed system employs a two-stage hybrid retrieval pipeline, illustrated in Figure 1, that aims to
balance lexical precision with semantic robustness in order to
overcome the challenges posed by Bengali–English Roman-script code-mixed queries. The
methodology comprises dataset preparation, lexical candidate
generation, semantic reranking, and validation under FIRE-style evaluation.</p>
      <sec id="sec-3-1">
        <title>3.1. Dataset Description</title>
        <p>• Corpus: 107,900 social media-style documents, ranging from short conversational posts (4 words)
to long narratives (up to 1,217 words), averaging 12.6 words.
• Queries: 20 code-mixed queries averaging 40.2 words each (Min: 9, Max: 88). Rather than
keyword-style queries, they are conversational and descriptive, reflecting the real-world
information needs shown in Tables 1 and 2.
• Relevance Judgments (Qrels): 5,409 manual judgments, with an average of 270 relevant documents per
query, to ensure a robust evaluation.</p>
        <p>The dataset was intentionally designed to simulate real-life
challenges such as spelling differences (e.g., valo/bhalo/bhallo for “good”), transliteration noise, and
skewed distributions of relevance across queries.</p>
        <p>Relevant Document: “Best hotels near Dashashwamedh Ghat with AC rooms and affordable rates . . .”
Non-Relevant Document: “Markets in Banaras remain open late, with famous shops for street food . . .”</p>
        <p>Relevant Document: “Apollo Hospital allows booking appointments directly via its website and mobile app . . .”
Non-Relevant Document: “There are many good doctors in Kolkata with high ratings, available at local clinics . . .”</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Stage 1: Lexical Retrieval with BM25</title>
        <p>Initially, we use BM25Okapi [19], with the Pyserini toolkit providing the BM25 implementation. Both
queries and documents are given a bag-of-words representation, with scores computed from term
frequency, inverse document frequency, and length normalization. To balance efficiency
and completeness, we retain the top-100 candidate documents per query for
the next stage. This ensures high recall at this stage, particularly for content that
includes English anchor terms such as hospital, market, or hotel.</p>
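        <p>The scoring just described can be sketched in a few lines. The snippet below is a minimal, self-contained illustration in pure Python rather than the Pyserini implementation the pipeline uses; the toy code-mixed corpus and query are hypothetical placeholders.</p>
        <preformat>
```python
# Minimal BM25 sketch (illustrative only; the pipeline uses Pyserini).
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """BM25 over tokenized docs: tf, idf, and length normalization."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    df = Counter()                       # document frequency per term
    for d in docs_tokens:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# Hypothetical code-mixed mini-corpus and query.
corpus = [
    "dada Dashashwamedh Ghat er kache bhalo hotel kothay pabo",
    "Apollo Hospital e online appointment booking kora jay",
    "Banaras er market raat porjonto khola thake",
]
docs = [doc.lower().split() for doc in corpus]
query = "bhalo hotel kothay".lower().split()
scores = bm25_scores(query, docs)
# Keep the top-100 candidates per query (here the corpus is tiny).
top100 = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:100]
```
        </preformat>
        <p>Only the first document shares terms with the query, so it ranks first; the top-100 cutoff then feeds the semantic stage.</p>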
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Stage 2: Semantic Reranking with SBERT</title>
        <p>Because BM25 cannot adequately handle transliteration variation or semantic paraphrasing,
we utilize Sentence-BERT (SBERT) [20], specifically the all-mpnet-base-v2 configuration. SBERT
transforms queries and candidate documents into dense vector embeddings, and cosine similarity is
employed to assess closeness. The candidates are re-ranked by semantic relevance to select the top-30
[22], preserving contextually similar documents even when there
is minimal lexical overlap.</p>
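        <p>The reranking step can be sketched as follows. To keep the example self-contained, hypothetical toy vectors stand in for the embeddings that SentenceTransformer("all-mpnet-base-v2").encode(...) would produce in the actual pipeline, and the document IDs are placeholders.</p>
        <preformat>
```python
# Cosine-similarity rerank over BM25 candidates (toy vectors, not real SBERT output).
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

query_emb = [0.9, 0.1, 0.0]
candidate_embs = {               # the BM25 top-100 would be embedded here
    "doc_17": [0.8, 0.2, 0.1],   # semantically close to the query
    "doc_42": [0.1, 0.9, 0.3],   # lexically retrieved, semantically far
}
reranked = sorted(candidate_embs,
                  key=lambda d: cosine(query_emb, candidate_embs[d]),
                  reverse=True)
top30 = reranked[:30]            # keep the top-30, as in the pipeline
```
        </preformat>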
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Hybrid Retrieval Advantage</title>
        <p>This combined architecture brings together the benefits of both lexical and neural retrieval:
• BM25 provides scalable coverage and captures keyword-level overlap.
• SBERT mitigates transliteration noise, semantic ambiguity, and paraphrasing effects.</p>
        <p>The final
ranked results are stored in TREC-readable format (qid, docno, rank, score, tag) to facilitate standard
evaluation under the FIRE protocol. Mean Average Precision (MAP), which serves as the primary metric for
FIRE-style ranking tasks, is used to assess the effectiveness of the system, since it captures retrieval
precision and the ranking order of results across queries. All experiments were carried out on a
workstation with an NVIDIA GPU (12 GB), which allowed fast computation of the required SBERT
embeddings.</p>
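        <p>Emitting the run file can be sketched as below. The rows and the run tag are hypothetical placeholders; note that the standard six-column TREC line also carries a literal "Q0" column between qid and docno.</p>
        <preformat>
```python
# Write the final ranking in TREC run format: "qid Q0 docno rank score tag".
run_rows = [                 # (qid, docno, score), already sorted by score
    ("q1", "doc_17", 0.984),
    ("q1", "doc_42", 0.208),
]
lines = []
for rank, (qid, docno, score) in enumerate(run_rows, start=1):
    lines.append(f"{qid} Q0 {docno} {rank} {score:.6f} AiNauts_Run3")
trec_run = "\n".join(lines)  # one line per retrieved document
```
        </preformat>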
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <sec id="sec-4-1">
        <title>4.1. Indexing and Retrieval Framework</title>
        <p>The corpus was indexed with the Pyserini toolkit, which supplies efficient implementations of classical retrieval
functions, as well as support for TREC-style evaluation. In the lexical stage, we used the BM25Okapi
algorithm with its default parameterization (k1 = 1.2, b = 0.75). The BM25 stage produced the top-100
candidate documents for each query, which were then reranked.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Semantic Reranking Configuration</title>
        <p>In the semantic phase we employed Sentence-BERT (SBERT) from the all-mpnet-base-v2 checkpoint,
a transformer-based encoder previously fine-tuned for semantic similarity tasks. We
generated embeddings for the queries and documents with GPU acceleration, using cosine similarity as
the scoring function. The final re-ranked list for each query was truncated to the top 30 documents.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Evaluation Protocol</title>
        <p>Evaluation followed FIRE guidelines with Mean Average Precision (MAP) as the primary metric. MAP
was chosen for its robustness in accounting for both retrieval accuracy and rank positions of relevant
documents. This makes it particularly suitable for code-mixed, noisy environments [24] where relevance
distribution is highly variable.</p>
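        <p>For reference, the MAP computation described above can be sketched as follows; the run and qrels below are toy placeholders, not data from the actual task.</p>
        <preformat>
```python
# MAP: mean over queries of average precision (AP), where AP averages
# precision@k at each relevant hit over the number of relevant documents.
def average_precision(ranked_docs, relevant):
    hits, precisions = 0, []
    for k, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

run = {"q1": ["d1", "d9", "d2"], "q2": ["d7", "d3"]}   # toy rankings
qrels = {"q1": {"d1", "d2"}, "q2": {"d3"}}             # toy judgments
ap = {q: average_precision(run[q], qrels[q]) for q in run}
map_score = sum(ap.values()) / len(ap)                 # mean over queries
```
        </preformat>
        <p>Because MAP rewards placing relevant documents early, it penalizes the mid-list hits that purely lexical systems often produce on code-mixed queries.</p>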
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Baselines</title>
        <p>We compared our hybrid methodology against three baseline models:
• TF-IDF + Cosine Similarity: a classical lexical model.
• BM25: the best-performing lexical baseline.
• Word2Vec Embeddings: a static embedding-based semantic retrieval model.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Hardware and Implementation</title>
        <p>All experiments were run on a workstation with an NVIDIA GPU (12 GB), 64 GB of RAM, and an Intel
Xeon CPU. The implementation was written in Python 3.10, with PyTorch used for SBERT inference
and Anserini/Pyserini for indexing and retrieval.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Analysis</title>
      <p>Our results demonstrate the effectiveness of our hybrid retrieval framework across the five model
configurations described below. Each configuration varies how the lexical retrieval component (BM25)
is combined with a semantic component (SBERT, CrossEncoder, FAISS).</p>
      <sec id="sec-5-1">
        <title>5.1. Model Configurations</title>
        <p>Model 1 - SBERT + FAISS: First, both the documents in the corpus and the queries
are encoded into msmarco-distilbert-base-tas-b embeddings. Then, FAISS is used to search for
approximate nearest neighbors based on the embeddings and retrieve the top-100 records.</p>
        <p>Model 2 - BM25 + SBERT Reranker: BM25 retrieves the top-100 candidates from the corpus, and
the final candidate set is then selected by cosine similarity over the SBERT embeddings, keeping the
top-3 candidates.</p>
        <p>Model 3 - BM25 + SBERT Hybrid (Proposed): BM25 first recalls the top-100 lexically matching
records, and SBERT then reranks the top-30 to produce the final 3 records.</p>
        <p>Model 4 - SBERT + FAISS Dense Retrieval: End-to-end dense retrieval (semantic retrieval) is
performed using the SBERT embeddings and FAISS.</p>
        <p>Model 5 - SBERT Dense Retrieval + CrossEncoder Reranking: First, full
dense retrieval is performed (SBERT embeddings with FAISS). A 12-layer CrossEncoder is then
applied to rerank the final set of candidates retrieved by SBERT.</p>
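        <p>The dense-retrieval stage shared by Models 1, 4, and 5 can be sketched with an exact-search stand-in for FAISS; the two-dimensional vectors and document IDs are toy placeholders, and with real SBERT embeddings a faiss.IndexFlatIP index would replace the sorted scan.</p>
        <preformat>
```python
# Exact inner-product search over L2-normalized toy embeddings,
# standing in for FAISS approximate nearest-neighbor retrieval.
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

doc_embs = {
    "doc_a": normalize([0.9, 0.1]),
    "doc_b": normalize([0.2, 0.8]),
}
query = normalize([0.8, 0.3])
ranked = sorted(
    doc_embs,
    key=lambda d: sum(q * x for q, x in zip(query, doc_embs[d])),
    reverse=True,
)
top100 = ranked[:100]   # FAISS would return these (approximately)
```
        </preformat>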
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Quantitative Results</title>
        <p>The theory underpinning the findings reported here is that a hybrid approach capitalizes
on the different strengths of lexical retrieval and semantic reranking. Lexical approaches, like BM25, are
highly efficient at leveraging exact keyword matches, but poor at dealing with
transliteration variation and, in particular, the contextual disambiguation common in code-mixed queries. On the
other hand, purely semantic approaches, such as SBERT or CrossEncoder, provide context-sensitive
embeddings that address semantic drift and paraphrasing, at the cost of recall, especially when
lexical overlap is weak. The hybrid BM25 + SBERT model (Model 3) combines the advantages of each
method: BM25 offers broad recall over a large candidate set, while SBERT reranking provides deeper
semantic discrimination to rank contextually fitting candidates as more relevant. This explains why Model
3 outperformed the others presented in Table 3 across all evaluation metrics: MAP [22], nDCG, and
precision, establishing hybrid retrieval as the overall best paradigm for Bengali–English Roman-script IR.
The hybrid model (AiNauts - Run 3) reached a MAP of 0.054379 (5.4%), a considerable
improvement over the baseline models. Table 4 (MAP Comparison Across Models) shows that TF-IDF achieved a
MAP of 0.0341 (3.4%), BM25 achieved a MAP of 0.0371 (3.7%), and Word2Vec-based semantic retrieval
achieved a MAP of 0.0437 (4.4%). All baselines achieved lower MAPs than the proposed hybrid model.
These results clearly show that combining BM25 lexical retrieval with SBERT semantic reranking
effectively unites both advantages, achieving over an 83% relative improvement over
the BM25 baseline and outperforming each individual model.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Key Findings</title>
        <p>The AiNauts Run 3 hybrid model (BM25 + SBERT) achieved the best overall performance, with MAP
= 0.054379, nDCG = 0.152751, P@5 = 0.173333, and P@10 = 0.13, outperforming all baselines, as shown
in Figure 2. The hybrid method outperformed BM25 alone by a relative margin of 83%,
showing lexical recall and semantic reranking working together. Word2Vec and the other
embedding-based models showed small improvements but were much less robust to transliteration
variation and ambiguous contexts. Overall, these findings suggest that the two-stage hybrid
retrieval framework is effective for both efficiency and contextual accuracy, especially with Bengali–English
Roman-script code-mixed data.</p>
        <p>Confusion Matrix: While IR tasks are fundamentally re-ranking tasks and rely on ranking metrics,
we generated a simple binary relevance confusion matrix for Model 3 (relevant vs. non-relevant) with
the use of the Qrels. From this confusion matrix, shown in Figure 3, we observe:</p>
        <p>• True Positives (TP): Most relevant documents were correctly ranked in top positions.
• False Negatives (FN): A few relevant documents missed due to rare transliterations.
• False Positives (FP): Some semantically similar but non-relevant documents were included.</p>
        <p>Precision–Recall Curve: The precision–recall curve additionally indicates that Model 3 performs
better than the other models across thresholds. The findings show that a hybrid retrieval method
(BM25 + SBERT) outperforms both purely lexical and purely semantic methods. BM25 supplies the lexical recall
needed for wide coverage, while SBERT provides context to disambiguate candidates
in the reranking process. This combination reduces both lexical mismatches
and semantic ambiguities, making the retrieval process especially suitable for Bengali–English
Roman-script queries.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>The findings of this study indicate that, in code-mixed Information Retrieval (IR) contexts, hybrid
retrieval methods surpass lexical-only or semantic-only methods, particularly for Bengali–English
Roman-script queries. As discussed in the evaluation section, Model 3 (BM25 + SBERT Hybrid)
consistently provided the best performance on MAP, nDCG, and precision measures
compared to the purely lexical (BM25) and purely semantic (SBERT, CrossEncoder) models. This
substantiates our hypothesis that a balanced approach to retrieval combines lexical
recall with semantic, contextual understanding.</p>
      <p>The bar plot demonstrates that while lexical-only or dense-only models can capture some aspects of
relevance, they do not consistently retrieve documents that are contextually correct. Additionally, the
confusion matrix clearly demonstrates that the hybrid system produces far fewer false negatives, which is
important in a noisy code-mixed environment where relevant matches can be obscured by
transliteration variation and spelling inconsistencies. The precision-recall analysis also
demonstrates that the hybrid system performs more strongly across thresholds
and across different distributions of query and document relevance.
Even with these strengths, two issues remain. First, uncommon or idiosyncratic transliterations missing
from the training data can still cause retrieval errors. Second,
the computational cost of semantic reranking (particularly with CrossEncoder-based models) restricts
real-time deployment at scale without hardware acceleration or other optimization.
Therefore, further efficiency improvements are warranted before widespread use
in low-resource contexts.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future Work</title>
      <p>In this research, we have proposed the AiNauts Run 3 hybrid retrieval architecture for Bengali–English
Roman-script Information Retrieval, which integrates BM25 for lexical retrieval
with SBERT for reranking document relevance at the semantic level. The system was evaluated in
the CMIR 2025 shared task, where it achieved 8th place overall (MAP = 0.054379, nDCG = 0.152751,
P@5 = 0.173333, P@10 = 0.13). These results represent an 83% relative improvement
over the BM25 baseline, reinforcing the idea that hybrid retrieval leverages both
lexical knowledge and semantic interpretation for mixed-language IR tasks. Several
areas remain for future work. One is the refinement of
transliteration and phonetic normalization techniques [23] to alleviate spelling variability in document
access. Additionally, fine-tuning multilingual transformer models (e.g., IndicBERT, XLM-R, mT5) on
Roman-script Bengali–English code-mixed data may improve understanding and contextual
coherence. The efficiency of hybrid pipelines could be improved through approximate nearest-neighbour
indices and/or model compression or distillation, making them more viable for real-time
deployment. In addition, expanding the datasets to include other Indic code-mixed pairs would create
opportunities to establish broader FIRE-style benchmarks for comparability. Finally, deploying
this framework in real-world search engine and QA systems would likely benefit
communities shaped by migration and multilingualism [25]. Overall, these
contributions and future plans position hybrid retrieval as a strong and scalable model for code-mixed
IR, bridging lexical accuracy and semantic consistency in low-resource multilingual
settings.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Declaration on Generative AI</title>
      <p>During the writing of this paper, we used a generative AI assistant only in limited capacities to support
the writing process. The AI was primarily used to assist with revising language, structuring sections,
and maintaining consistency in the LaTeX format. All technical contributions, experimental design, model
development, and reported results were conceptualized, constructed, and validated by the authors alone.
The generative AI assistant did not generate any new research ideas, nor did it have any influence over
the reported results. The AI was no more than a supportive resource, comparable to grammar-checking
or typesetting tools. All content in this paper was thoroughly reviewed and approved by the authors.</p>
    </sec>
    <sec id="sec-9">
      <title>References</title>
      <p>[10] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching Word Vectors with Subword Information,” Transactions of ACL, vol. 5, pp. 135–146, 2017.</p>
      <p>[11] A. Conneau et al., “Unsupervised Cross-lingual Representation Learning at Scale,” Proc. ACL, pp. 8440–8451, 2020.</p>
      <p>[12] D. Kakwani et al., “IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages,” Proc. LREC, pp. 4940–4951, 2020.</p>
      <p>[13] L. Xue et al., “mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer,” Proc. NAACL, pp. 483–498, 2021.</p>
      <p>[14] F. Feng, Y. Yang, D. Cer, N. Arivazhagan, and W. Wang, “Language-agnostic BERT Sentence Embedding,” Proc. ACL, pp. 878–891, 2020.</p>
      <p>[15] S. Santhanam, O. Khattab, and A. Potts, “ColBERT-X: A Generalization of ColBERT to Cross-Lingual Information Retrieval,” Proc. NAACL, pp. 483–498, 2022.</p>
      <p>[16] P. Banerjee, S. Choudhury, M. Jha, and P. Majumder, “Overview of the FIRE 2022 Shared Task on Code-Mixed Information Retrieval (CMIR),” FIRE Working Notes, CEUR-WS, vol. 3392, pp. 1–9, 2022.</p>
      <p>[17] A. Mandal, S. Das, and S. Chakrabarti, “Cross-Lingual and Code-Mixed Information Retrieval in Indic Languages: Findings from FIRE 2023,” FIRE Working Notes, CEUR-WS, vol. 3603, pp. 12–21, 2023.</p>
      <p>[18] R. Kumar, S. Kumar, and S. Banerjee, “HASOC 2021: Hate Speech and Offensive Content Identification in English and Code-Mixed Languages,” FIRE Working Notes, CEUR-WS, vol. 3159, pp. 1–12, 2021.</p>
      <p>[19] J. Lin et al., “Pyserini: A Python Toolkit for Reproducible IR Research with Sparse and Dense Representations,” Proc. SIGIR, pp. 2356–2362, 2021.</p>
      <p>[20] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks,” Proc. EMNLP, pp. 3982–3992, 2019.</p>
      <p>[21] T. Sakai, “On the Reliability of Information Retrieval Metrics Based on Graded Relevance,” Information Retrieval Journal, vol. 13, no. 4, pp. 202–228, 2010.</p>
      <p>[22] O. Khattab and M. Zaharia, “ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT,” Proc. SIGIR, pp. 39–48, 2020.</p>
      <p>[23] P. Sharma, R. Jain, and A. Gupta, “Challenges in Multilingual IR: Transliteration and Normalization Approaches for Indic Languages,” Proc. FIRE, pp. 120–128, 2019.</p>
      <p>[24] A. Chowdhury and G. Pass, “Information Retrieval with Noisy User-Generated Data,” SIGIR Forum, vol. 53, no. 2, pp. 48–56, 2019.</p>
      <p>[25] M. Artetxe and H. Schwenk, “Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Retrieval,” TACL, vol. 7, pp. 597–610, 2019.</p>
      <p>[26] S. Chanda, K. Tewari, and S. Pal, “Findings of the Code-Mixed Information Retrieval from Social Media Data (CMIR) Shared Task at FIRE 2025,” Forum for Information Retrieval Evaluation (Working Notes), CEUR-WS.org, Varanasi, India, 2025.</p>
      <p>[27] S. Chanda, K. Tewari, and S. Pal, “Overview of the CMIR Track at FIRE 2025: Code-Mixed Information Retrieval from Social Media Data,” Proc. 17th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE ’25), ACM, New York, NY, USA, 2025.</p>
      <p>[28] S. Chanda and S. Pal, “Overview of the Shared Task on Code-Mixed Information Retrieval from Social Media Data,” Proc. 16th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE ’24), ACM, pp. 29–31, 2025.</p>
      <p>[29] S. Chanda and S. Pal, “The Effect of Stopword Removal on Information Retrieval for Code-Mixed Data Obtained via Social Media,” SN Computer Science, Springer, vol. 4, no. 5, 2023, doi: 10.1007/s42979-023-01942-7.</p>
      <p>[30] S. Chanda and S. Pal, “Overview of the Shared Task on Code-Mixed Information Retrieval from Social Media Data,” Forum for Information Retrieval Evaluation (Working Notes), CEUR-WS.org, Varanasi, India, pp. 124–128, 2025.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Nukapangu</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          , “
          <article-title>Effectiveness of Code-Switching in Language Classroom in India at Primary Level: A Case of L2 Teachers' Perspectives,”</article-title>
          <source>Pegem Journal of Education and Instruction</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>379</fpage>
          -
          <lpage>385</lpage>
          ,
          <year>2021</year>
          , doi: 10.47750/pegegog.11.04.37.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tarihoran</surname>
          </string-name>
, E. Fachriyah, Tressyalina, and
<string-name>
<given-names>I. R.</given-names>
<surname>Sumirat</surname>
</string-name>
          , “
          <article-title>The Impact of Social Media on the Use of Code Mixing by Generation Z,”</article-title>
          <source>International Journal of Interactive Mobile Technologies (iJIM)</source>
          , vol.
          <volume>16</volume>
          , no.
          <issue>7</issue>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>69</lpage>
          ,
          <year>2022</year>
, doi: 10.3991/ijim.v16i07.27659.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Singla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Reshi</surname>
          </string-name>
          ,
and
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Salameh</surname>
          </string-name>
          , “
<article-title>Machine Learning Techniques for Sentiment Analysis of Code-Mixed and Switched Indian Social Media Text Corpus: A Comprehensive Review,”</article-title>
<source>IJACSA</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>455</fpage>
          -
          <lpage>467</lpage>
          ,
          <year>2022</year>
, doi: 10.14569/IJACSA.2022.0130260.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramzan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aziz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghafar</surname>
          </string-name>
          , “
<article-title>A study of code-mixing and code-switching (Urdu and Punjabi) in children's early speech,”</article-title>
<source>Journal of Language and Linguistic Studies</source>
          , vol.
          <volume>17</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>869</fpage>
          -
          <lpage>881</lpage>
          ,
          <year>2021</year>
          , doi: 10.52462/jlls.60.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fitri</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Pamungkas</surname>
          </string-name>
          , “
<article-title>The use of code switching and code mixing by Indofood and Unilever food advertisements on television,”</article-title>
<source>INFERENCE: Journal of English Language Teaching</source>
          , vol.
          <volume>5</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>251</fpage>
          -
          <lpage>260</lpage>
, Dec. 2022–Mar. 2023.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. W. Y.</given-names>
            <surname>Ho</surname>
          </string-name>
          , “
          <article-title>Code-mixing: Linguistic form and socio-cultural meaning,”</article-title>
          <source>The International Journal of Language Society and Culture</source>
          , vol.
          <volume>21</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fanani</surname>
          </string-name>
and
<string-name>
<given-names>J. A. R. Z.</given-names>
<surname>Ma'u</surname>
</string-name>
, “
          <article-title>Code-switching and code-mixing in English learning process</article-title>
          ,
          <source>” LingTera</source>
          , vol.
          <volume>5</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>77</lpage>
          ,
          <year>2018</year>
, doi: 10.21831/lt.v5i1.14438.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitra</surname>
          </string-name>
, and
<string-name>
<given-names>G.</given-names>
<surname>Paikray</surname>
</string-name>
, “
          <article-title>Overview of FIRE Information Retrieval Evaluation,”</article-title>
          <source>Information Retrieval Journal</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>84</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
,
<string-name>
<given-names>G.</given-names>
<surname>Corrado</surname>
</string-name>
, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          , “
<article-title>Efficient Estimation of Word Representations in Vector Space,”</article-title>
          <source>Proc. ICLR Workshop</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>