<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AI4SciSci'25: Workshop on Artificial Intelligence and the Science of Science</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jian Wu</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sarah Rajtmajer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yi He</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Information Sciences and Technology, The Pennsylvania State University</institution>
          ,
          <addr-line>University Park, PA, 16802</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Computer Science, Old Dominion University</institution>
          ,
          <addr-line>Norfolk, VA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Data Science, College of William &amp; Mary</institution>
          ,
          <addr-line>Williamsburg, VA 23185</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>This preface summarizes the theme, organization, review process, schedule, and accepted papers of the 2nd International Workshop on Artificial Intelligence and the Science of Science (AI4SciSci'25). The workshop welcomed submissions reporting both original works and previously published works. Ten papers reporting original works were submitted for peer review. Of these, seven were accepted for this volume: three as regular papers, two as short papers, and two as extended abstracts. The workshop also accepted two published papers for presentation; these are not included in the proceedings.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Preface</title>
      <p>Author homepages: https://www.cs.odu.edu/~jwu/ (J. Wu); https://www.rajtmajerlab.net/ (S. Rajtmajer); https://yhe15.people.wm.edu/ (Y. He)</p>
      <p>CEUR Workshop Proceedings (ISSN 1613-0073)</p>
    </sec>
    <sec id="sec-2">
      <title>2. Accepted Papers</title>
      <p>Accepted papers are briefly introduced below. Keywords were provided by the authors.</p>
      <sec id="sec-2-1">
        <title>2.1. Published Works</title>
        <p>• Transforming Role Classification in Scientific Teams using LLMs and Advanced Predictive Analytics (Wonduk Seo and Yi Bu)</p>
        <p>Keywords: Author Role Classification, Large Language Models (LLMs), Predictive Analytics, SHAP Interpretability</p>
        <p>• Scientific Productivity and Practice in the Era of Large Language Models (Keigo Kusumegi, Xinyu Yang, Paul Ginsparg, Mathijs de Vaan, Toby Stuart and Yian Yin)</p>
        <p>Keywords: Large Language Models, Productivity, Citation Analysis</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Original Works – Full Papers</title>
        <p>• Humans vs. LLMs on Open Domain Scientific Claim Verification: A Baseline Study (Benjamin Curtis, Stefania Dzhaman, Matthew Maisonave and Jian Wu)</p>
        <p>Keywords: Scientific Claim Verification, Large Language Model, Large Reasoning Model, Prompt Engineering</p>
        <p>• A Gradio-Based Toolkit for Remote Sensing Data Fusion Literature (Caleb Cheruiyot, Jonathan P. Leidig and Jiaxin Du)</p>
        <p>Keywords: Remote Sensing, Data Fusion, Knowledge Graph, Uncertainty Tagging, Gradio, BERT</p>
        <p>• Tracing Research Inequality in NLP: How Resource Disparities Shape Topic Trends and Methodological Diffusion via Citations (Lizhen Liang and Bei Yu)</p>
        <p>Keywords: Research Inequality, Citation Intent, Scholarly Text Processing</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Original Works – Short Papers</title>
        <p>• A Case For Clarity: Knowledge Engineering And Its Evolving Role In Fundamental And Applied Medical Sciences (Henri Van Overmeire and Patrick Wouters)</p>
        <p>Keywords: Physiology, Medicine, Knowledge Engineering, Epistemology, Central Venous Pressure, IEER</p>
        <p>• Quantifying Contextual Hallucinations in NLP Research Papers Before and After the LLM Era (Adiba Ibnat Hossain, Miftahul Jannat Mokarrama and Hamed Alhoori)</p>
        <p>Keywords: Hallucination, LLM, Scientific Writing, Context Inconsistency, Faithfulness, NLP, Academic Integrity</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Original Works – Extended Abstracts</title>
        <p>• Geography of Medical Knowledge: Scientific Focus, Disease Burden, and Research Response (Hongyu Zhou, Prashant Garg and Thiemo Fetzer)</p>
        <p>Keywords: Research Responsiveness, Geographic Disparities, Disease Burden, LLM-enabled Knowledge Graph</p>
        <p>• Applying LLM to Library Metadata: Mapping Geography and Language in the Library of Congress Collection (Kai Li, Hongyu Zhou, Raf Guns, Tim C.E. Engels and Brian Dobreski)</p>
        <p>Keywords: Library Metadata, Large Language Models, Knowledge Geography, Cataloging, Cultural Representation</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Keynote</title>
      <p>We invited Dr. Daniel Acuña, Associate Professor in the Department of Computer Science at the University of Colorado Boulder, to deliver the keynote. Daniel leads the Science of Science and Computational Discovery Lab. He works in the science of science, a subfield of computational social science, and in AI for science. He publishes research and builds web-based software tools to accelerate knowledge discovery. His current work aims to understand the historical relationships, mechanisms, and optimization opportunities of knowledge production. The title and abstract of his talk follow.</p>
      <p>Estimating the predictability of questionable open-access journals</p>
      <p>Abstract: Questionable journals threaten global research integrity, yet manual vetting can be slow and inflexible. Here, we explore the potential of artificial intelligence (AI) to systematically identify such venues by analyzing website design, content, and publication metadata. Evaluated against extensive human-annotated datasets, our method achieves practical accuracy and uncovers previously overlooked indicators of journal legitimacy. By adjusting the decision threshold, our method can prioritize either comprehensive screening or precise, low-noise identification. At a balanced threshold, we flag over 1,000 suspect journals, which collectively publish hundreds of thousands of articles, receive millions of citations, acknowledge funding from major agencies, and attract authors from developing countries. Error analysis reveals challenges involving discontinued titles, book series misclassified as journals, and small society outlets with limited online presence; these issues are addressable with improved data quality. Our findings demonstrate AI’s potential for scalable integrity checks, while also highlighting the need to pair automated triage with expert review.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Acknowledgments</title>
      <p>We thank the JCDL’25 Chairs for supporting the workshop. We thank Northern Illinois University and Old Dominion University for providing technical support. We also thank EasyChair for providing our review platform. Special thanks to Rochana R. Obadage, PhD candidate at Old Dominion University, for building and maintaining the website.</p>
      <p>We also want to express our gratitude to the program committee members who dedicated their time to reviewing papers. The committee members are listed in alphabetical order by last name.</p>
      <p>During the preparation of this preface, the authors used Grammarly and OpenAI GPT for grammar and spell checking.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>