<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Overview of Cross-Lingual Mathematical Information Retrieval at FIRE 2025</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ayushi Malik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pankaj Dadure</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sahinur Rahman Laskar</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>UPES Dehradun</institution>
          ,
          <addr-line>Uttarakhand</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>11</volume>
      <fpage>17</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>This abstract provides a short overview of the first edition of the shared task on Cross-Lingual Mathematical Information Retrieval (CLMIR) organized at the 17th Forum for Information Retrieval Evaluation (FIRE 2025). A more detailed discussion of approaches used by the participating teams is available in the track overview paper. The CLMIR shared task at FIRE 2025 is designed to encourage participants to develop retrieval approaches that can process queries in one language and retrieve relevant results written in another, thereby bridging the accessibility gap between diverse user groups. Although the CLMIR shared task witnessed encouraging interest with 9 teams registering, only 6 teams submitted their results.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Retrieval</kwd>
        <kwd>Mathematical Information Retrieval</kwd>
        <kwd>Cross-lingual Mathematical Information Retrieval</kwd>
        <kwd>Digital Libraries</kwd>
        <kwd>Search Engines</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Natural Language Processing (NLP) is a transformative field that enables computers to understand,
interpret, and generate human language [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][2]. With advancements in machine learning and deep
learning, NLP applications have expanded from text summarization [3] and question answering [4] to
complex tasks such as cross-lingual understanding. A key domain where NLP shows practical significance is
Information Retrieval (IR) [5], which focuses on searching, ranking, and retrieving relevant information
from large repositories. Traditional IR systems relied on keyword- or semantic-based retrieval [6][7]
of text data, but the growth of digital information has introduced structured, semi-structured, and
multimodal data [8][9][10][11][12]. Among these, mathematical data poses particular challenges due
to its symbolic, non-linear structure and its representation in markup languages such as LaTeX and MathML.
General-purpose IR systems often fail to process mathematical queries effectively, which led to the emergence of
Mathematical Information Retrieval (MIR) [13]. MIR deals with retrieving mathematical expressions,
equations, and theorems while addressing issues of structural representation and integration with
textual data. Despite these advances, most MIR systems, such as MathWebSearch [14][15][16], SimSearch
[17][18][19], Mathcat [20][21], Tangent [22][23][24][25][26][27][28][29], and Wolfram Alpha [30][31],
remain monolingual, predominantly focusing on English. This limits accessibility for researchers
and learners who work in other languages. To bridge this gap, cross-lingual mathematical information
retrieval (CLMIR) extends the scope of MIR by enabling users to query in one language and retrieve
mathematical data in another [32]. For example, a user may enter a query in English and retrieve relevant
mathematical data written in Hindi, thereby lowering the linguistic barriers to accessing mathematical
knowledge. However, CLMIR presents a unique set of challenges: the scarcity of cross-lingual mathematical
datasets, difficulties in aligning mathematical symbols with multilingual textual descriptions, handling
structural and semantic variations of mathematical expressions across languages, and ensuring effective
ranking of results in cross-lingual settings. Addressing these challenges is essential for developing
robust approaches that support cross-lingual access to mathematical knowledge on a global scale
[33].
      </p>
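      <p>As an illustration of the representational gap noted above, the same expression can be written
compactly in LaTeX or far more verbosely in Presentation MathML, and a retrieval system must treat the
two encodings as equivalent; a small hypothetical example:</p>
      <preformat>
LaTeX:   x^{2} + 1

MathML:  &lt;math&gt;
           &lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;
           &lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;
         &lt;/math&gt;
      </preformat>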
    </sec>
    <sec id="sec-2">
      <title>2. Track Description</title>
      <p>The CLMIR shared task at FIRE 2025 is designed to address the challenges of retrieving mathematical
data across languages, with a specific focus on the English-Hindi cross-lingual setting. The
task encourages participants to develop retrieval approaches that can process queries in one
language and retrieve relevant results written in another. The primary objectives of CLMIR are to
promote research in cross-lingual retrieval of mathematical data, to create a benchmark dataset for
English-Hindi MIR that enables a fair comparison of retrieval models, and to encourage the development of
novel retrieval approaches that combine symbolic mathematics understanding with cross-lingual natural
language processing. This is the first shared task focused on cross-lingual retrieval of mathematical
content, making CLMIR an important step toward cross-lingual accessibility in STEM education and
research.</p>
      <sec id="sec-2-1">
        <title>2.1. Use Case</title>
        <p>A representative use case demonstrates how a CLMIR model retrieves relevant mathematical content
across languages: a query posed in English is matched against candidate results in Hindi, and the model
must distinguish relevant from irrelevant results by matching both the mathematical expressions and the
textual meaning accurately.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Participation</title>
        <p>A total of 9 teams registered for the task: Archisha Dhyani (UPES), CJM (IISER Kolkata), Tends
(NIT Kurukshetra), IReL (IIT BHU Varanasi), j22 (IISER Kolkata), NLP Fusion (Mangalore University),
MathQA_AUS (Assam University), DUCS_CLMIR (University of Delhi), and Retriever (Dhirubhai Ambani
University). Of these 9 teams, only 6, Archisha Dhyani, Tends, IReL, NLP Fusion, DUCS_CLMIR, and
Retriever, submitted results. Among these 6 teams, only 4, Tends, IReL, NLP Fusion, and Retriever,
submitted working notes.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset and Evaluation</title>
      <p>The dataset for the CLMIR task is curated from the Math Stack Exchange corpus of ARQMath-1
and contains 39,862 instances. Each instance includes a title and a body containing mathematical
formulas expressed in LaTeX along with supporting textual descriptions. The provided corpus is
structured to maintain cross-lingual consistency between Hindi and English, ensuring that the
mathematical content and contextual information are accurately aligned across both languages. The data
covers content in Hindi, including mathematical equations, expressions, and related textual
information, and is designed to support the development and evaluation of cross-lingual mathematical
information retrieval approaches. To assess performance, 50 formula- and text-based queries are
provided, and participants are expected to submit a results file containing the relevant search results
for each query. The performance of participants’ approaches in the CLMIR 2025 task is evaluated using
three metrics: Precision@10 (P@10), Mean Average Precision (MAP), and Normalized Discounted Cumulative
Gain (nDCG).</p>
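      <p>All three metrics can be computed from a ranked result list and a set of relevance judgments. The
following is a minimal sketch assuming binary relevance (the graded-relevance form commonly used for
nDCG generalizes the gain term); it is illustrative, not the official evaluation script:</p>
      <preformat>
```python
import math

def precision_at_k(ranked, relevant, k=10):
    """P@k: fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def average_precision(ranked, relevant):
    """AP: mean of precision values at each rank where a relevant document occurs."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked, relevant, k=10):
    """nDCG@k with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1) if doc in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```
      </preformat>
      <p>MAP is then the mean of the per-query average precision over all 50 queries.</p>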
    </sec>
    <sec id="sec-4">
      <title>4. System Description</title>
      <p>The CLMIR shared task attracted diverse participation, with teams proposing a variety of approaches
to address the challenges of retrieving mathematical data across languages. This section provides a
detailed description of the baseline approach and the methodologies submitted by participating teams.</p>
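      <p>At their core, most of the dense systems described below follow the same embed-and-rank recipe:
encode the query and every document as a vector, score by cosine similarity, and keep the top-k. A
minimal sketch follows; the vectors here are illustrative stand-ins for the multilingual transformer
embeddings the actual systems compute:</p>
      <preformat>
```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_by_cosine(query_vec, doc_vecs, k=50):
    """Score every document vector against the query; return the k best (index, score) pairs."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```
      </preformat>
      <p>In practice the exhaustive scan above is replaced by an approximate nearest-neighbour index such
as FAISS, which several teams used.</p>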
      <sec id="sec-4-1">
        <title>4.1. Baseline System (Organizer)</title>
        <p>The baseline system was developed by first translating the input query into Hindi using the Google
Translate API to ensure consistency in language representation. Following translation, embeddings
for both the query and the training data were generated using the
sentence-transformers/paraphrase-multilingual-mpnet-base-v2 model. This model was selected for its proven multilingual capability,
robust performance in capturing semantic similarity across languages, and demonstrated effectiveness
on Hindi text. Cosine similarity was then used to measure the closeness between the query and
training-data embeddings, and the top 50 most relevant results were retrieved based on these similarity
scores.</p>
      </sec>
      <sec id="sec-4-1-1">
        <title>4.2. Tends</title>
        <p>Team Tends designed a cross-lingual retrieval pipeline that integrates machine translation, dense
retrieval, and re-ranking. The Hindi corpus from ARQMath-1 was first translated into English using the
AI4Bharat IndicTrans2 model, ensuring that both queries and documents resided in the same language
space for retrieval. Each document was represented as a composite of its title, body, and tags, which
were encoded into contextualized token embeddings using ColBERTv2, a late-interaction dense retrieval
model optimized for fine-grained query-document matching. The embeddings were stored in a Facebook AI
Similarity Search (FAISS) index, enabling efficient approximate nearest-neighbour search. At retrieval
time, FAISS retrieved the top 100 candidate documents, which were further refined by a cross-encoder
(MiniLM-L6-v2) that jointly encoded each query-document pair to produce precise semantic relevance
scores. The final ranked list of the top 50 documents was returned as output. Since the corpus was
fully translated into English, incoming English queries, including both text and mathematical
expressions, were processed directly without any additional translation, making the system both
efficient and semantically robust [34].</p>
      </sec>
      <sec id="sec-4-1-2">
        <title>4.3. IReL</title>
        <p>Team IReL proposed a hybrid retrieval system that integrates multilingual embeddings and symbolic
formula similarity. Their pipeline first applied preprocessing to separate textual and mathematical
components, transliterating Hindi text into Latin script for compatibility with transformer-based
encoders. For semantic alignment, they used MiniLM embeddings in Run 1 and MPNet embeddings in Run 2,
with candidate retrieval performed through FAISS indexing. Mathematical similarity was modeled using
two strategies: a lightweight token-overlap method in Run 1 and a more expressive tree-based structural
similarity in Run 2 [35].</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.4. Retriever</title>
        <p>Team Retriever developed a two-stage English-Hindi CLMIR system combining BM25-based sparse
retrieval with LLM-based pointwise re-ranking. English text and formula queries were translated into
Hindi using GPT-4, and BM25 retrieved the top 100 candidate documents from the ARQMath-1 corpus. These
candidates were then re-ranked using instruction-tuned Gemma-2 models (4B and 12B) in a zero-shot
setting, with binary relevance predictions converted into probabilities for ranking [36].</p>
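        <p>The first stage of this pipeline is standard Okapi BM25. The following sketch uses the
conventional default parameters k1 = 1.5 and b = 0.75, which are assumptions; the team's exact
configuration and tokenization are not specified in the source:</p>
        <preformat>
```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term across the corpus
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / denom
        scores.append(s)
    return scores
```
        </preformat>
        <p>The top-scoring candidates from such a function would then be passed to the LLM re-ranker.</p>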
      </sec>
      <sec id="sec-4-3">
        <title>4.5. NLP_Fusion</title>
        <p>
          Team NLP_Fusion proposed a semantic retrieval system for the CLMIR 2025 shared task, focusing
on English-Hindi cross-lingual mathematical information retrieval. Their approach used
sentence-transformers/all-MiniLM-L6-v2 embeddings to generate multilingual semantic representations of
documents and queries, combined with FAISS for efficient vector-based similarity search. The Hindi
documents from ARQMath-1 were preprocessed by concatenating titles, bodies, and tags, then split into
manageable chunks to preserve contextual coherence. The system retrieved the top 50 most relevant
chunks for each query, with additional post-processing to keep scores within the [0, 1] range for
consistency [37].
        </p>
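        <p>The working notes do not specify the exact score transform, but mapping raw similarity scores
into [0, 1] is typically done with a min-max normalization; an assumed sketch:</p>
        <preformat>
```python
def minmax_normalize(scores):
    """Map a list of raw scores linearly onto the [0, 1] interval."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # all scores equal: return a constant list
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]
```
        </preformat>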
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Results and Analysis</title>
      <p>We evaluated all team submissions for the CLMIR 2025 shared task using three metrics: P@10, MAP,
and nDCG. The results are shown in Table 1. Each team was allowed up to three runs, and the
differences in performance reflect the design of their systems. The baseline system used Google
Translate to convert queries into Hindi, generated multilingual embeddings with the MPNet model,
and ranked documents using cosine similarity. Team Archisha Dhyani showed steady improvements,
with Run 3 giving the best results, which suggests that their system became stronger after refinements.
DUCS_CLMIR produced stable results across runs, but at a lower level than other teams, likely
because their method did not add new improvements across submissions. IReL performed much better
in Run 2, with a hybrid retrieval system that integrates multilingual embeddings and symbolic formula
similarity. NLP_Fusion gave consistent results across all three runs, with the third run being the
strongest; they used sentence-transformers/all-MiniLM-L6-v2 embeddings to generate multilingual
semantic representations of documents and queries, combined with FAISS for efficient vector-based
similarity search. Retriever improved substantially in later runs, with Run 3 achieving one of the
best MAP and nDCG scores overall; its combination of BM25 retrieval with large language model
re-ranking helped capture both word overlap and deeper meaning. Finally, Tends submitted only one run,
but it still achieved strong results. Their system, which used translation, dense retrieval, and
re-ranking, showed solid performance even without multiple submissions.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The CLMIR shared task at FIRE 2025 represents the first attempt to systematically evaluate the retrieval
of mathematical data across languages, with a focus on the English-Hindi pair. By providing a large-scale
cross-lingual dataset, well-defined queries, and a clear evaluation framework, this task establishes a
foundation for research in a less-explored area of cross-lingual mathematical information retrieval.
In future editions, we aim to expand the dataset to cover additional language pairs and incorporate
more diverse query types (including multi-formula and contextual queries). Furthermore, by releasing
resources such as baseline models, smaller benchmark datasets, and multilingual extensions, we intend
to make the task more accessible and encourage participation from a broader range of researchers.
Through these steps, CLMIR seeks to foster innovation in cross-lingual mathematical information
retrieval, broaden access to STEM resources, and build a long-term research community dedicated to
advancing cross-lingual access to mathematical knowledge.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>This research work is supported under the Visvesvaraya PhD Scheme for Electronics and IT, implemented
by the Ministry of Electronics and Information Technology (MeitY), Government of India. The authors
also thank UPES Dehradun for providing the necessary support and research infrastructure.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-9">
      <title>References</title>
      <p>[22] K. D. V. Wangari, R. Zanibbi, A. Agarwal, Discovering real-world use cases for a multimodal
math search interface, in: Proceedings of the 37th International ACM SIGIR Conference on Research &amp;
Development in Information Retrieval, 2014, pp. 947–950.</p>
      <p>[23] D. Stalnaker, R. Zanibbi, Math expression retrieval using an inverted index over symbol
pairs, in: Document Recognition and Retrieval XXII, volume 9402, SPIE, 2015, pp. 34–45.</p>
      <p>[24] R. Zanibbi, K. Davila, A. Kane, F. Tompa, The Tangent search engine: Improved similarity
metrics and scalability for math formula search, arXiv preprint arXiv:1507.06235 (2015).</p>
      <p>[25] R. Zanibbi, K. Davila, A. Kane, F. W. Tompa, Multi-stage math formula search: Using
appearance-based similarity metrics at scale, in: Proceedings of the 39th International ACM SIGIR
Conference on Research and Development in Information Retrieval, 2016, pp. 145–154.</p>
      <p>[26] K. Davila, R. Zanibbi, Layout and semantics: Combining representations for mathematical
formula search, in: Proceedings of the 40th International ACM SIGIR Conference on Research and
Development in Information Retrieval, 2017, pp. 1165–1168.</p>
      <p>[27] D. Fraser, A. Kane, F. W. Tompa, Choosing math features for BM25 ranking with Tangent-L,
in: Proceedings of the ACM Symposium on Document Engineering 2018, 2018, pp. 1–10.</p>
      <p>[28] W. Zhong, R. Zanibbi, Structural similarity search for formulas using leaf-root paths in
operator subtrees, in: Advances in Information Retrieval: 41st European Conference on IR Research,
ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I, Springer, 2019, pp. 116–129.</p>
      <p>[29] B. Mansouri, S. Rohatgi, D. W. Oard, J. Wu, C. L. Giles, R. Zanibbi, Tangent-CFT: An
embedding model for mathematical formulas, in: Proceedings of the 2019 ACM SIGIR International
Conference on Theory of Information Retrieval, 2019, pp. 11–18.</p>
      <p>[30] V. E. Dimiceli, A. S. Lang, L. Locke, Teaching calculus with Wolfram|Alpha, International
Journal of Mathematical Education in Science and Technology 41 (2010) 1061–1071.</p>
      <p>[31] W. N. S. Wan Mohd Rosly, S. S. Syed Abdullah, F. N. Ahmad Shukri, The uses of Wolfram Alpha
in mathematics, Teaching and Learning in Higher Education (TLHE) 1 (2020) 96–103.</p>
      <p>[32] J. Gore, J. Polletta, B. Mansouri, CrossMath: Towards cross-lingual math information
retrieval, in: Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information
Retrieval, 2024, pp. 101–105.</p>
      <p>[33] A. Malik, P. Dadure, S. R. Laskar, A review of mathematical information retrieval: Bridging
symbolic representation and intelligent retrieval, Archives of Computational Methods in Engineering
(2025) 1–35.</p>
      <p>[34] A. Sur, V. Shukla, A. Rai, L. Garg, Enhancing multilingual mathematical document retrieval:
A Hindi to English translation and ColBERT based approach, in: Proceedings of the Forum for
Information Retrieval Evaluation (FIRE 2025), ACM, 2025.</p>
      <p>[35] K. Tewari, S. Chanda, R. Tripathi, S. Pal, MIRACLE: Multilingual information retrieval with
cross-lingual embeddings for mathematical expressions, in: Proceedings of the Forum for Information
Retrieval Evaluation (FIRE 2025), 2025. To appear.</p>
      <p>[36] K. Kachhadiya, P. Patel, Cross-lingual mathematical information retrieval with BM25 and
LLM-based pointwise re-ranking, in: Proceedings of the Forum for Information Retrieval Evaluation
(FIRE 2025), 2025. To appear.</p>
      <p>[37] K. S. Coelho, A. Hegde, M. Z. Taljeh, A. Ahmad, Math beyond language barriers: Retrieving
mathematical content using sentence transformers, in: Proceedings of the Forum for Information
Retrieval Evaluation (FIRE 2025), 2025. To appear.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Enríquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. L.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Ortega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Vallejo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Troyano</surname>
          </string-name>
          ,
          <article-title>A comparative study of classifier combination applied to nlp tasks</article-title>
          ,
          <source>Information Fusion</source>
          <volume>14</volume>
          (
          <year>2013</year>
          )
          <fpage>255</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>