<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Research Inequality in NLP: How Resource Disparities Shape Topic Trends and Methodological Diffusion via Citations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lizhen Liang</string-name>
          <email>lliang06@syr.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bei Yu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Syracuse University</institution>
          ,
          <addr-line>Syracuse, New York</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>The growing resource gap between institutions raises critical questions about transparency, replicability, and inclusiveness in AI research. While some AI research topics remain accessible, research in areas such as large language models (LLMs) necessitates resources such as computational power and data access: resources largely concentrated among industry companies and a few top universities. This study investigates research inequality in NLP by analyzing topic shifts, institutional resource gaps, and citation intent patterns in ACL Anthology papers. The rise of LLMs and generative tasks has driven increased attention to topics such as Language Modeling, Generation, and Multimodality, while traditional areas like Machine Translation and Syntax/Parsing have declined. High-resource institutions are more likely to publish on these trending topics, as indicated by higher topic shift ratios; in contrast, low-resource teams are concentrated in declining topics. Citation intent analysis reveals that methodology-use citations, which indicate resource transfer, are decreasing over time, particularly in trending topics. This trend is especially pronounced in citations from low-resource to high-resource teams, suggesting that widening computational and infrastructural gaps limit the ability of low-resource institutions to adopt and build upon frontier research. These findings highlight a growing divide in NLP research participation and impact, underscoring the need for more inclusive and equitable research practices.</p>
      </abstract>
      <kwd-group>
        <kwd>Methodological</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Modern AI research demands increasing resources, especially access to large-scale infrastructure and
datasets, creating a significant advantage to institutions with greater financial and computational
capacity. For instance, in 2020, private enterprises reportedly spent over $80 billion on AI, while
U.S. federal non-defense investment in AI-related research and development amounted to just $1.5
billion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This disparity has enabled well-resourced teams, especially those affiliated with major
technology companies, to drive the development of increasingly sophisticated AI models. In contrast,
many academic and public-sector institutions lack the resources necessary to reproduce, extend, or
critically evaluate these advances [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], raising concerns about the inclusiveness and reproducibility of
progress in the field.
      </p>
      <p>
        This resource gap not only affects what institutions can build but also shapes what research questions
they choose to ask [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. While industry actors often drive progress through proprietary models that
require vast resources [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], academic and under-resourced teams often focus on problems that are more
computationally tractable or theoretically grounded [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
      </p>
      <p>
        Despite this significant resource disparity, the growing availability of open-source software
frameworks, pretrained models, and benchmark datasets, has contributed to a broader participation in AI
research [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], as evidenced by the influx of new authors in recent years [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This raises a critical question:
to what extent do high-resource teams, while pushing the frontier, also act as enablers (e.g., through
the release of resources) for broader access? To better understand this dynamic, we focus on the NLP
research community and address two research questions: 1) How do research topics differ between
research teams from high-resource and low-resource institutions? And 2) To what extent has research
from high-resource institutions lowered or heightened the barriers for low-resource teams in NLP?
      </p>
      <p>CEUR Workshop Proceedings (ISSN 1613-0073). https://liamliang.github.io/ (L. Liang); https://ischool.syracuse.edu/bei-yu/ (B. Yu)</p>
      <p>To investigate the first question, we analyzed temporal shifts in topic distributions across institutions
with varying resource levels, examining whether low-resource teams are increasingly constrained in
the scope of topics they pursue.</p>
      <p>
        Drawing on theories of citation that consider citations as framing devices and connectors of
intellectual lineages [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and previous studies that have used citation analysis to understand research
dynamics [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11, 12</xref>
        ], we investigate our second research question by analyzing citation patterns.
Specifically, we examine how low-resource teams cite the work of high-resource teams, with a focus on
methodological adoption, such as the use of models, datasets, or software developed by high-resource
teams. This approach interprets methodology-use citations as a proxy for resource transfer from the
cited institutions to the citing institutions, as the increasing prevalence of methodology-use citations
has been linked to the growing availability of reusable technologies and evaluations [
        <xref ref-type="bibr" rid="ref11">13, 11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Data and Methods</title>
      <p>In this work, we synthesized data from various sources and trained two prediction models to generate
variables for downstream analyses. To study research topic shift, we retrieved titles and abstracts of
ACL Anthology papers published after 2010 from the ACL-OCL corpus [ 14]. We selected a smaller
time window since AI research intensified in the past 10-15 years. Each paper’s author affiliation
metadata was retrieved from OpenAlex, an openly accessible database containing metadata on scientific
research publications [15]. To estimate each institution’s resource level, we generated a proxy variable
predicted by a machine learning model trained on research expenditure data and bibliometric features.
Citation context data were obtained through Semantic Scholar’s S2AG API [16]. To analyze patterns
of methodological diffusion, we fine-tuned a citation intent classifier to identify method-use citations,
which are instances where the citing paper adopts tools, models, or methods from the cited work. The
following subsections describe the above tasks respectively.</p>
      <sec id="sec-2-1">
        <title>2.1. Modeling Research Topic Shift</title>
        <p>The ACL-OCL corpus includes the full text of 73k papers from the ACL Anthology up to September
2022. We selected the papers published since 2010, since AI research intensified in the past 10-15 years.</p>
        <p>We used paper titles and abstracts as input for topic modeling. We first embedded each paper’s title
and abstract using the SPECTER 2 language model [17]. The resulting embeddings were then reduced in
dimensionality using UMAP. HDBSCAN was then applied to further cluster the dimension-reduced
embeddings into topic clusters.</p>
        <p>Using topic coherence [18] as the evaluation metric, we compared BERTopic [19] under various
parameter settings. The highest coherence score was achieved by a BERTopic model configured with
275 neighbors, 125 UMAP components, and a minimum cluster size of 275 for HDBSCAN.</p>
        <p>We further validated the model by manually reviewing sample articles and representative keywords
from each topic cluster, comparing them with the ACL submission topics. Using this comparison, we
were able to assign each topic cluster a label based on keywords from the ACL topic list (see Table 3
in the appendix). Finally, each paper was assigned a topic label based on the highest topic probability
generated by the topic model.</p>
        <p>We then calculated the topic shift ratio to measure the annual change in a research topic’s popularity,
i.e. whether it gained or lost attention, using the following equation:</p>
        <p>Topic Shift Ratio = P(paper assigned to topic | paper published in year) / P(paper assigned to topic | paper published before year)
(1)</p>
        <p>If the topic shift ratio for Topic X is greater than 1 in Year Y, it means that Topic X became more
prevalent in Year Y, compared to its prevalence in the years before.</p>
        <p>A slightly modified equation was designed to compare the popularity of a topic before and after a
cutoff year, e.g. 2016:</p>
        <p>Topic Shift Ratio = P(paper assigned to topic | paper published after year) / P(paper assigned to topic | paper published before year)
(2)</p>
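        <p>As a minimal sketch (our own illustrative code, not the paper's implementation), both ratios can be computed from per-paper (year, topic) records:</p>

```python
from collections import Counter

def topic_share(papers, topic, keep_year):
    """P(paper assigned to topic, conditioned on its publication year)."""
    pool = [t for (y, t) in papers if keep_year(y)]
    return Counter(pool)[topic] / len(pool) if pool else 0.0

def topic_shift_ratio(papers, topic, year):
    """Equation (1): topic share in `year` over its share in prior years."""
    before = topic_share(papers, topic, lambda y: year > y)
    in_year = topic_share(papers, topic, lambda y: y == year)
    return in_year / before if before else float("inf")

def topic_shift_ratio_cutoff(papers, topic, cutoff):
    """Equation (2): topic share after a cutoff year over its share before."""
    before = topic_share(papers, topic, lambda y: cutoff > y)
    after = topic_share(papers, topic, lambda y: y >= cutoff)
    return after / before if before else float("inf")

# Toy corpus of (year, topic) pairs: "lm" gains attention in 2016.
papers = [(2014, "mt"), (2014, "lm"), (2015, "mt"), (2015, "mt"),
          (2016, "lm"), (2016, "lm"), (2016, "mt")]
```

        <p>On this toy corpus, the 2016 shift ratio for "lm" is (2/3)/(1/4), i.e. greater than 1, so the topic gained attention, while "mt" falls below 1.</p>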
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Estimating Institutional AI Resources</title>
        <p>We trained a regression model to estimate an institution’s AI resources. The training data come from
the 2023 Higher Education Research and Development (HERD) Survey. The HERD Survey is an annual
census of 501 U.S. colleges and universities that expended at least $150,000 in research and development.
The survey includes data for various research areas. We used expenditure in the area of information
and computer science as a proxy measure for an institution’s AI research resources.</p>
        <p>Bibliometric features have long served as tools in the science of science [20] and the scientometric
community [21]. Employing these features allows for investigations of the characteristics and dynamics
inherent in scientific activities and entities. We extracted author affiliations for each ACL paper from
OpenAlex, an openly accessible database containing metadata on scientific research publications. For
each institution, we aggregated 15 bibliometric features, including (1) basic counts, i.e. the number of
publications, citations, co-institutions, and researchers for each institution; (2) researcher seniority for
each institution including the mean, median, min, and max researcher h-index; (3) outbound citation
targets, such as the number of unique authors, institutions, and publishing venues (such as journals,
conference proceedings); (4) outbound citations aggregated at different research entity levels, such as
author-, institution-, and publishing venue-level citations. See Table 5 in the appendix for a full list of
features and their definitions.</p>
        <p>Using the bibliometric features and expenditure data as training data, we trained and cross-validated
linear regression and random forest regression models with different hyper-parameters. The model
with the best performance is a random forest regression model with the maximum depth set to 20,
minimum samples split set to 2. The model achieves 0.407 R-squared on the testing dataset and 0.907
R-squared on the training dataset.</p>
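        <p>The training setup can be sketched with scikit-learn, using synthetic stand-in data since the HERD expenditures and real bibliometric features are not reproduced here (a sketch under those assumptions, not the actual trained model):</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: 15 bibliometric features per institution and a
# pseudo-expenditure target; the real model was trained on HERD data.
X = rng.normal(size=(500, 15))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Best configuration reported above: max_depth=20, min_samples_split=2.
model = RandomForestRegressor(max_depth=20, min_samples_split=2, random_state=0)
model.fit(X_train, y_train)

print("train R^2:", round(model.score(X_train, y_train), 3))
print("test R^2:", round(model.score(X_test, y_test), 3))
```

        <p>As in the paper, `feature_importances_` on the fitted model reveals which bibliometric features (here, the two signal columns) drive the predicted expenditure.</p>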
        <p>Using the best prediction model and institution-level bibliometric data for all institutions that
ACL-OCL authors are affiliated with, we predicted a pseudo-expenditure value for each affiliation as a proxy
for the amount of their AI resources. The feature importance of the model is shown in Table 4 in the
appendix. We found that the features “Number of citations” and “Number of publications” are among
the most important features. This makes sense, since research spending should be positively correlated
with the size and capacity of institutions.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Identifying Methodology-use Citations</title>
        <p>For each citation to ACL-OCL papers, we retrieved the citation context, or the citation sentence, from
the S2AG database. We applied the method proposed by [22] because it doesn’t require external data
such as author and affiliation information to achieve performance comparable to the state-of-the-art.
We fine-tuned a SciBERT model using SciCite and ACL-ARC data as a multi-task learning task.</p>
        <p>The ACL-ARC dataset [13] provides annotations for citation intents with six classes, including
Extends, Future, Motivation, Compares, Uses, and Background, for 1,969 citation sentences from 10
ACL Anthology articles.</p>
        <p>The SciCite data set includes 11,020 citation sentences from computer science and medicine articles
sampled from the Semantic Scholar corpus [23]. The SciCite data schema was simplified based on
ACL-ARC, after removing citation intent categories that are rare or not useful for meta-analysis of
scientific publications. SciCite includes three categories: background information, use of methods, and
comparing results. Here we refer to them briefly as background, methodology, and result citations.
        <p>Using SciCite as the main training set and ACL-ARC as the auxiliary training set, the resulting model
has achieved a macro 0.86 F1 score on the SciCite dataset, with balanced precision and recall values on
all categories. This result is comparable to [24]. See Table 1 for category-level performance.</p>
        <p>Using the fine-tuned citation intent classification model, we predicted citation intent for each citation
context retrieved from S2AG.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <sec id="sec-3-1">
        <title>3.1. Research Topic Shifts in NLP</title>
        <sec id="sec-3-1-1">
          <title>3.1.1. Topic Differences Between Low-resource and High-resource Teams</title>
          <p>We compared the distribution of research topics between high-resource and low-resource teams.
Using our regression model to estimate each institution’s AI resource level, we classified the top 10% of
institutions as high-resource, and the remaining 90% as low-resource. We define a research team as
the group of authors on a single paper. The team’s resource level is determined by the highest-ranked
institution among the authors’ affiliations. We then assigned each paper a topic shift ratio, based on its
publication year and its assigned research topic (see Figure 1).</p>
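          <p>The top-decile split and the max-over-affiliations team label can be sketched as follows (illustrative code with made-up institution names and values):</p>

```python
def resource_labels(pseudo_expenditure, top_share=0.10):
    """Label the top decile of institutions (by pseudo-expenditure) high-resource."""
    ranked = sorted(pseudo_expenditure, key=pseudo_expenditure.get, reverse=True)
    n_high = max(1, int(len(ranked) * top_share))
    high = set(ranked[:n_high])
    return {inst: inst in high for inst in ranked}

def team_is_high_resource(author_affiliations, is_high):
    """A paper's team takes the level of its highest-ranked affiliation."""
    return any(is_high.get(inst, False) for inst in author_affiliations)

# Hypothetical pseudo-expenditure values for ten institutions.
expenditure = {"uni_a": 9.0, "uni_b": 5.0, "uni_c": 1.0, "uni_d": 0.5,
               "uni_e": 0.4, "uni_f": 0.3, "uni_g": 0.2, "uni_h": 0.1,
               "uni_i": 0.05, "uni_j": 0.01}
is_high = resource_labels(expenditure)  # only uni_a lands in the top 10%
```

          <p>A paper co-authored by `uni_c` and `uni_a` counts as high-resource, because the team inherits its highest-ranked affiliation.</p>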
          <p>We found that papers from high-resource teams were associated with significantly higher topic
shift ratios than those from low-resource teams, indicating that high-resource teams are more likely to
publish on trending topics. This difference is statistically significant, as shown by a Mann–Whitney U
test (U = 230,316,303.0, p &lt; 0.001).</p>
          <p>We then compared the topic distributions between low-resource and high-resource research teams by
counting the number of papers published in each topic and applying chi-squared tests to assess statistical
differences. Figure 3 shows the residuals from these tests for papers published in 2016 (χ² = 70.913,
p &lt; 0.0001), 2018 (χ² = 55.188, p &lt; 0.0001), and 2020 (χ² = 58.455, p &lt; 0.0001). The topics are ordered
by their overall trend, with those declining in popularity near the top and trending topics near the
bottom (based on topic shift ratios calculated in 2016 and held consistent across all panels).</p>
          <p>Positive residuals (in blue color) indicate over-representation by high-resource teams, and negative
residuals (in red color) indicate over-representation by low-resource teams. The pattern across all three
years suggests a persistent topic divide: low-resource teams are more concentrated in declining topics,
while high-resource teams increasingly dominate emerging and computationally intensive areas.</p>
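          <p>The residuals plotted in Figure 3 can be reproduced from a contingency table of paper counts; a minimal pure-Python sketch with illustrative counts (not the paper's data):</p>

```python
import math

def pearson_residuals(table):
    """Pearson residuals (observed - expected) / sqrt(expected) for a
    2 x k contingency table: rows = resource level, columns = topics.
    The chi-squared statistic is the sum of the squared residuals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    resid = []
    for i, row in enumerate(table):
        resid.append([])
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            resid[i].append((obs - expected) / math.sqrt(expected))
    return resid

# rows: [high-resource, low-resource]; columns: two illustrative topics
table = [[30, 10],
         [20, 40]]
res = pearson_residuals(table)
```

          <p>Here the positive residual in row 0, column 0 signals over-representation of high-resource teams in that topic, mirroring the blue/red coloring described above.</p>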
          <p>(Figure 3: chi-squared residuals of topic distributions; panels A. 2016, B. 2018, C. 2020.)</p>
          <p>Considering methodology-use citation as an indicator of “resource transfer” from one institution to
another, we analyzed the intent of citations to the ACL Anthology papers. Overall, about one half
of citations are background information, about one third methodology use, and the remainder result
discussion. Figure 4A presents the proportions of these citation intent types, normalized by
the number of citations per year, from 2010 to 2022, illustrating a trend that background citations are
increasing (Mann-Kendall test: τ = 0.923, p &lt; 0.0001), while methodology-use citations (Mann-Kendall
test: τ = −0.718, p &lt; 0.001) and result citations (Mann-Kendall test: τ = −0.923, p &lt; 0.001) have been
declining.</p>
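          <p>The Mann-Kendall statistics reported here reduce to Kendall's τ over a yearly series of proportions; a minimal sketch of the statistic (illustrative only, the full test also yields a p-value):</p>

```python
def mann_kendall_tau(series):
    """Kendall's tau over a time-ordered series: the proportion of
    later-vs-earlier pairs that increase, minus the proportion that
    decrease, so +1.0 is a strictly rising trend and -1.0 a falling one."""
    n = len(series)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            if diff > 0:
                s += 1
            elif 0 > diff:
                s -= 1
    return s / (n * (n - 1) / 2)

# Hypothetical yearly citation-intent proportions.
rising = [0.40, 0.43, 0.47, 0.52, 0.55]    # e.g. background citations
falling = [0.36, 0.34, 0.31, 0.29, 0.27]   # e.g. methodology citations
```

          <p>A value of τ near ±1 with a small p-value, as in the figures above, indicates a near-monotone trend across years.</p>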
          <p>As NLP literature expands, it is not surprising to see researchers citing more prior work as background
information. However, the decrease in methodology-use citations needs further examination to see
whether it indicates a decline in resource transfer, since the growing resource gap may prevent
low-resource teams from adopting certain methods due to limitations in computing power, data access,
and funding. If this is true, we should expect the increase in background citations and the decrease in
methodology-use citations to be more pronounced in citations from low-resource teams to high-resource
teams, especially among trending topics, since high-resource teams are more likely to work on trending
topics.</p>
          <p>(Figure 4: proportions of citation intent types per year, 2010–2022. A. All citations; B. Background citations and C. Methodology citations, for trending vs. declining topics.)</p>
          <p>To compare methodology-use citations in trending and declining topics, we selected papers from
the top five most trending topics (Language Model, Computational Social Science and Cultural Analytics,
Generation, Question Answering, Multimodality and Language Grounding to Vision, Robotics and Beyond),
and the top five most declining topics (Grammar Correction, Resources and Evaluation, Syntax: Tagging,
Chunking and Parsing / ML, Machine Translation, Speech Recognition).</p>
          <p>Figure 4B shows that background citations have increased over time for both trending and declining
topics. This trend is supported by the Mann-Kendall test results: for trending topics, τ = 0.769, p &lt; 0.001,
with a slope of 0.0046 and intercept of 0.50; for declining topics, τ = 0.718, p &lt; 0.001, with a slope of
0.0061 and intercept of 0.43. While both trends are significant, the higher intercept for trending topics
suggests that they generally require more background citations than declining topics. This is consistent
with expectations: researchers working in fast-evolving areas may need to cite a broader base of prior
work to contextualize and support their arguments.</p>
          <p>Figure 4C shows the proportion of methodology-use citations over time for both trending and
declining topics. For trending topics, the Mann-Kendall test reveals a decreasing trend in Methodology
citation proportion (slope = −0.0022, intercept = 0.2911), whereas declining topics show a more stable
and higher baseline level (slope = −0.0001, intercept = 0.3150). This suggests that in fast-moving areas,
researchers are less likely to cite existing models, datasets, or tools.
We further examined whether low-resource teams experienced a more pronounced decline in
methodology citations to work produced by high-resource teams.</p>
          <p>Figure 5 clearly shows a declining trend of methodology-use citations from low-resource teams
to high-resource teams, with the downward trend accelerating after 2016 (Mann-Kendall test: τ =
−0.846, p &lt; 0.0001, slope = −0.0058, intercept = 0.3310). In comparison, the overall trend across all
ACL papers shows a much more gradual decline (Mann-Kendall test: τ = −0.718, p &lt; 0.001, slope =
−0.0014, intercept = 0.3082).</p>
          <p>The strong and accelerating decrease in methodology-use citations from low-resource teams to
high-resource teams suggests that it is increasingly more challenging for low-resource teams to engage
with or build upon the methodologies developed by high-resource teams.</p>
          <p>(Figure 5: proportion of methodology-use citations from low-resource to high-resource teams, 2010–2022.)</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.2.2. Citations from Low-resource Teams to High-resource Teams between Trending Topics and Declining Topics</title>
          <p>Combining the resource and topic factors, we further compared methodology-use citations from
low-resource teams to high-resource teams among trending and declining topics.</p>
          <p>Figure 6C shows the same trend that methodology-use citations have been declining for both trending
topics (Mann-Kendall test: τ = −0.600, p &lt; 0.05, slope = −0.0067, intercept = 0.2983) and declining
topics (Mann-Kendall test: τ = −0.527, p &lt; 0.05, slope = −0.0047, intercept = 0.3448). Additionally,
methodology-use citations are consistently less prevalent in trending topics (average proportion for
trending topics: 0.27; declining topics: 0.32). Both patterns support the interpretation that low-resource
teams face greater challenges in adopting methods from high-resource teams, particularly in trending
research areas.</p>
          <p>(Figure 6: citation intent proportions for citations from low-resource to high-resource teams. A. Trending topics; B. Declining topics; C. Methodology citations.)</p>
          <p>To further validate our findings, we conducted a linear regression analysis, using the proportion
of methodology citations received by each cited paper as the dependent variable. We aggregated
methodology-use citations at the paper level and included three key predictors: (1) the maximum
predicted research expenditure among the cited paper’s co-author affiliations as a proxy for institutional
resource level, (2) the publication year, and (3) the normalized topic popularity of the cited paper based
on the topic popularity ranking from Figure 1.</p>
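          <p>This regression setup can be sketched with ordinary least squares on simulated data; the coefficients below are made up to mirror only the sign pattern reported in Table 2, not the paper's estimates:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Illustrative stand-ins for the three predictors (standardized): resource
# level, publication year, and topic popularity of the cited paper.
resource = rng.normal(size=n)
year = rng.normal(size=n)
popularity = rng.normal(size=n)

# Simulated dependent variable with negative associations on all three
# predictors; magnitudes are hypothetical, for illustration only.
meth_share = (0.33 - 0.02 * resource - 0.01 * year - 0.03 * popularity
              + rng.normal(scale=0.01, size=n))

# Design matrix with an intercept column, fit by least squares.
X = np.column_stack([np.ones(n), resource, year, popularity])
coef, *_ = np.linalg.lstsq(X, meth_share, rcond=None)
print("intercept, resource, year, popularity:", coef.round(3))
```

          <p>With this setup the three slope estimates come back negative, matching the direction of the associations discussed below.</p>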
          <p>As shown in Table 2, the results indicate that the proportion of methodology-use citations is
significantly and negatively associated with the cited team’s resource level, the popularity of the cited paper’s
topic, and publication year. In additional models, we find that larger gaps in resource levels and topic
popularity between citing and cited teams are also significantly associated with fewer methodology
citations. These findings reinforce the interpretation that institutional and topical asymmetries may
increasingly constrain methodological reuse, particularly disadvantaging low-resource teams when
citing high-resource work in trending areas.</p>
          <p>
            The results combined provide more evidence that resource barriers limit the adoption of methods
proposed by high-resource teams, and such a phenomenon is more serious for publications related to
trending topics. According to [13], such a trend could indicate a decrease in reusable technologies such
as models and datasets, and evaluations of tools. Such a decrease could be related to the increasing
resource gap in the field of AI. We also visualized the proportion of non-methodology citations. Latour
[
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] suggests that non-methodological citations are important for defending proposed ideas. The increase
in non-methodology citations can thus be read as a growing need to defend newly proposed ideas, and
as decreasing consensus within the ACL Anthology community. This interpretation makes sense, as AI
is fast-growing and new ideas must be introduced to an increasingly interdisciplinary community.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this work, by analyzing the research topics and citation intent, we investigate the disparities between
low-resource and high-resource institutions in the natural language processing research community.
Our findings indicate that high-resource teams have been focusing on research topics that are gaining
popularity, whereas low-resource teams have been more likely to work on topics that are becoming
less prominent. This suggests that access to resources such as computational power and large datasets
plays a significant role in determining what research topic a team can study. Furthermore, our results
reveal that research produced by high-resource teams is becoming increasingly difficult for other
researchers to build upon. Such results suggest a growing divide in AI research, where advancements
driven by high-resource industry corporations and universities may inadvertently limit the accessibility
of cutting-edge research to those with fewer resources. These findings indicate the need for more
inclusive research practices and collaborative efforts to ensure that AI innovation remains accessible to
a broader research community. Future work should explore potential strategies for bridging this gap.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Declaration on Generative AI</title>
      <p>During the preparation of this work, ChatGPT was used for editing to improve the style and tone of
writing.</p>
      <p>[12] K. Nishikawa, How and why are citations between disciplines made? A citation context analysis
focusing on natural sciences and social sciences and humanities, Scientometrics 128 (2023) 2975–2997.</p>
      <p>URL: https://link.springer.com/10.1007/s11192-023-04664-y. doi:10.1007/s11192- 023- 04664- y.
[13] D. Jurgens, S. Kumar, R. Hoover, D. McFarland, D. Jurafsky, Measuring the Evolution of a Scientific
Field through Citation Frames, Transactions of the Association for Computational Linguistics 6
(2018) 391–406. URL: https://direct.mit.edu/tacl/article/43437. doi:10.1162/tacl_a_00028.
[14] S. Rohatgi, Y. Qin, B. Aw, N. Unnithan, M.-Y. Kan, The ACL OCL corpus: Advancing open science
in computational linguistics, in: H. Bouamor, J. Pino, K. Bali (Eds.), EMNLP, acl, Singapore, 2023.
[15] J. Priem, H. Piwowar, R. Orr, Openalex: A fully-open index of scholarly works, authors, venues,
institutions, and concepts, arXiv preprint arXiv:2205.01833 (2022).
[16] A. D. Wade, The semantic scholar academic graph (s2ag), in: Companion Proceedings of the Web</p>
      <p>Conference 2022, 2022, pp. 739–739.
[17] A. Cohan, S. Feldman, I. Beltagy, D. Downey, D. Weld, SPECTER: Document-level representation
learning using citation-informed transformers, Online, 2020.
[18] M. Röder, A. Both, A. Hinneburg, Exploring the Space of Topic Coherence Measures, in:
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining,
WSDM ’15, Association for Computing Machinery, New York, NY, USA, 2015, pp. 399–408. URL:
https://doi.org/10.1145/2684822.2685324. doi:10.1145/2684822.2685324.
[19] M. Grootendorst, Bertopic: Neural topic modeling with a class-based tf-idf procedure, arXiv
preprint arXiv:2203.05794 (2022).
[20] S. Fortunato, C. T. Bergstrom, K. Börner, J. A. Evans, D. Helbing, S. Milojević, A. M. Petersen,
F. Radicchi, R. Sinatra, B. Uzzi, A. Vespignani, L. Waltman, D. Wang, A.-L. Barabási, Science of
science, Science 359 (2018) eaao0185. URL: https://www.science.org/doi/10.1126/science.aao0185.
doi:10.1126/science.aao0185.
[21] L. Leydesdorff, S. Milojević, Scientometrics, arXiv preprint arXiv:1208.4566 (2012).
[22] Z. Shui, P. Karypis, D. S. Karls, M. Wen, S. Manchanda, E. B. Tadmor, G. Karypis, Fine-Tuning
Language Models on Multiple Datasets for Citation Intention Classification, 2024. URL: http:
//arxiv.org/abs/2410.13332. doi:10.48550/arXiv.2410.13332, arXiv:2410.13332 [cs].
[23] A. Cohan, W. Ammar, M. van Zuylen, F. Cady, Structural Scaffolds for Citation Intent Classification
in Scientific Publications, 2019. URL: http://arxiv.org/abs/1904.01608.
[24] L. Paolini, S. Vahdati, A. Di Iorio, R. Wardenga, I. Heibi, S. Peroni, Why do you cite? an investigation
on citation intents and decision-making classification processes, arXiv preprint arXiv:2407.13329
(2024).
[25] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, u. Kaiser, I. Polosukhin,
Attention is All you Need, in: Advances in Neural Information Processing Systems, volume 30,
Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper_files/paper/2017/hash/
3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
[26] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al., Improving language understanding by
generative pre-training (2018).
[27] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the
2019 Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational
Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423/.
doi:10.18653/v1/N19-1423.
[28] W. Ma, S. Liu, W. Wang, Q. Hu, Y. Liu, C. Zhang, L. Nie, Y. Liu,
ChatGPT: Understanding Code Syntax and Semantics, ???? URL: https://www.
semanticscholar.org/paper/ChatGPT%3A-Understanding-Code-Syntax-and-Semantics-Ma-Liu/
a7088c0dc34115ce38e6a37feba3c03497708047.
[29] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal,
E. Hambro, F. Azhar, A. Rodriguez, A. Joulin, E. Grave, G. Lample, LLaMA: Open and Efficient
Foundation Language Models, 2023. URL: http://arxiv.org/abs/2302.13971. doi:10.48550/arXiv.
2302.13971, arXiv:2302.13971 [cs].
[30] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis,
W.-t. Yih, T. Rocktäschel, S. Riedel, D. Kiela, Retrieval-Augmented Generation for
Knowledge-Intensive NLP Tasks, 2021. URL: http://arxiv.org/abs/2005.11401. doi:10.48550/arXiv.2005.
11401, arXiv:2005.11401 [cs].
[31] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, I. Sutskever, Zero-Shot
Text-to-Image Generation, 2021. URL: http://arxiv.org/abs/2102.12092. doi:10.48550/arXiv.2102.12092,
arXiv:2102.12092 [cs].</p>
    </sec>
    <sec id="sec-6">
      <title>A. Appendix</title>
      <p>Institution-level resource features used in this study: number of publications; number of citations; number of co-institutions; number of researchers; average, maximum, minimum, and median researcher h-index; number of works cited; number of author-level citations; number of authors cited; number of venue-level citations; number of venues cited; number of institution-level citations; number of institutions cited.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Littman</surname>
          </string-name>
          , I. Ajunwa, G. Berger,
          <string-name>
            <given-names>C.</given-names>
            <surname>Boutilier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Currie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hadfield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Horowitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Isbell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kitano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lyons</surname>
          </string-name>
          , M. Mitchell,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sloman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vallor</surname>
          </string-name>
          , T. Walsh, Gathering Strength,
          <source>Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report</source>
          (
          <year>2022</year>
          ). URL: https://arxiv.org/abs/2210.15767. doi:10.48550/arXiv.2210.15767.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Google Gemini Eats The World - Gemini Smashes GPT-4 By 5X, The GPU-Poors</article-title>
          ,
          <year>2023</year>
          . URL: https://www.semianalysis.com/p/google-gemini-eats-the-world-gemini.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Movva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Balachandar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Peng</surname>
          </string-name>
          , G. Agostini,
          <string-name>
            <given-names>N.</given-names>
            <surname>Garg</surname>
          </string-name>
          , E. Pierson, Topics, Authors, and Networks in
          <source>Large Language Model Research: Trends from a Survey of 17K arXiv Papers</source>
          (
          <year>2023</year>
          ). URL: https://arxiv.org/abs/2307.10700. doi:10.48550/arXiv.2307.10700.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wahed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. C.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>The growing influence of industry in AI research</article-title>
          ,
          <source>Science</source>
          <volume>379</volume>
          (
          <year>2023</year>
          )
          <fpage>884</fpage>
          -
          <lpage>886</lpage>
          . URL: https://www.science.org/doi/10.1126/science.ade2420. doi:10.1126/science.ade2420, publisher: American Association for the Advancement of Science.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Ignat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abzaliev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Biester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Gunal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kazemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khalifa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Koh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Nwatu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Perez-Rosas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          ,
          <article-title>Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models</article-title>
          , in: N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.),
          <source>Proceedings of the 2024 Joint International Conference on Computational Linguistics</source>
          ,
          Language Resources and Evaluation (LREC-COLING 2024), ELRA and ICCL, Torino, Italia,
          <year>2024</year>
          , pp.
          <fpage>8050</fpage>
          -
          <lpage>8094</lpage>
          . URL: https://aclanthology.org/2024.lrec-main.708/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Togelius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. N.</given-names>
            <surname>Yannakakis</surname>
          </string-name>
          ,
          <article-title>Choose Your Weapon: Survival Strategies for Depressed AI Academics</article-title>
          (
          <year>2023</year>
          ). URL: https://arxiv.org/abs/2304.06035. doi:10.48550/arXiv.2304.06035.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gururaja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bertsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Na</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Widder</surname>
          </string-name>
          , E. Strubell,
          <article-title>To build our future, we must know our past: Contextualizing paradigm shifts in natural language processing</article-title>
          , in: H. Bouamor, J. Pino, K. Bali (Eds.), EMNLP, Association for Computational Linguistics, Singapore,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Latour</surname>
          </string-name>
          ,
          <source>Science in Action: How to Follow Scientists and Engineers through Society</source>
          , Harvard University Press, Cambridge, Mass.,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Evaluating scientific impact of publications: combining citation polarity and purpose</article-title>
          ,
          <source>Scientometrics</source>
          <volume>127</volume>
          (
          <year>2022</year>
          )
          <fpage>5257</fpage>
          -
          <lpage>5281</lpage>
          . URL: https://doi.org/10.1007/s11192-021-04183-8. doi:10.1007/s11192-021-04183-8.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          , J. Liu,
          <article-title>Extracting the evolutionary backbone of scientific domains: The semantic main path network analysis approach based on citation context analysis</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>74</volume>
          (
          <year>2023</year>
          )
          <fpage>546</fpage>
          -
          <lpage>569</lpage>
          . URL: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24748. doi:10.1002/asi.24748.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>Natural Language Processing: A Historical Review</article-title>
          , in: Current Issues in Computational Linguistics: In Honour of Don Walker, Springer, Dordrecht,
          <year>1994</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          . URL: https://link.springer.com/chapter/10.1007/978-0-585-35958-8_1. doi:10.1007/978-0-585-35958-8_1.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>