=Paper=
{{Paper
|id=Vol-3432/paper38
|storemode=property
|title=Generalizable Neuro-Symbolic Systems for Commonsense Question Answering
|pdfUrl=https://ceur-ws.org/Vol-3432/paper38.pdf
|volume=Vol-3432
|authors=Alessandro Oltramari,Jonathan Francis,Filip Ilievski,Kaixin Ma,Roshanak Mirzaee
|dblpUrl=https://dblp.org/rec/conf/nesy/OltramariFIMM23
}}
==Generalizable Neuro-Symbolic Systems for Commonsense Question Answering==
Alessandro Oltramari (1,2,3,*), Jonathan Francis (1,2), Filip Ilievski (4), Kaixin Ma (2), Roshanak Mirzaee (5)

(1) Bosch Research and Technology Center, Pittsburgh, USA
(2) Carnegie Mellon University, Pittsburgh, USA
(3) Italian National Research Council, Institute of Cognitive Science and Technology, Trento, Italy
(4) Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, USA
(5) Department of Computer Science and Engineering, Michigan State University, East Lansing, USA

===1. Extended abstract===

In this contribution [1], we analyze different methods for integrating large language models (LLMs) and knowledge graphs (KGs), presenting a quantitative and qualitative evaluation of these neuro-symbolic strategies on a broad spectrum of question-answering benchmark datasets focused on commonsense reasoning. We illustrate how the combination of data-driven algorithms and symbolic knowledge is necessary to enable domain generalizability and robustness in downstream tasks. Although our work predates the disruptive innovation triggered by ChatGPT [2], it remains relevant, as it stems from a comprehensive analysis of the intrinsic problems that LLMs have in dealing with high-level forms of inference requiring spatial, temporal, analogical, and causal reasoning. We argue that such problems stem from the inherent limitations of the neural architectures behind those models and cannot be solved simply by increasing the size of the training data or by further scaling the number of parameters.

In our original paper we claim that two orthogonal extensions are required for LLMs: first, augmenting them with symbolic knowledge, which is available in large amounts thanks to open-source commonsense KGs; second, extending the resulting neuro-symbolic language models with components that can dynamically perform reasoning. We refer to the former direction as horizontal augmentation, as it focuses on expanding the coverage of the structured knowledge available to LLM-based methods, and to the latter as vertical augmentation, as it explores in depth how specific forms of reasoning can be enabled.

Our paper mostly focuses on horizontal augmentation, but it also lays the foundations for vertical augmentation. With regard to the former, we study techniques such as attention-based knowledge injection and fine-tuning LLMs with synthetic question-answer pairs generated from KGs (a toy illustration of the latter is sketched after this abstract): in both techniques, symbolic knowledge is typically translated into latent vectorial representations. However, while sub-symbolic expressions can augment training signals with features derived from explicit semantic content, such a knowledge-infusion process does not mandate how inferences over the learned knowledge should be performed. Such inference mechanisms are typically realized through logic-based reasoning, e.g., the Region Connection Calculus for spatial knowledge [3], Allen's axioms for temporal (or spatial) knowledge [4], etc.; a minimal temporal example also follows the abstract. (Relevant work shows that deep learning models can replicate logical reasoning, e.g., [5, 6], but it does not follow that every form of reasoning should be reduced to sub-symbolic learning, or at least this is an assumption only for some tightly coupled neuro-symbolic systems.) Implementing such inference mechanisms, and integrating them with large-scale knowledge-infusion methods, is what we advocate for in the future of neuro-symbolic AI research and development.
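To make the KG-to-QA synthesis concrete, here is a minimal Python sketch that turns ConceptNet-style (head, relation, tail) triples into multiple-choice question-answer pairs, sampling distractors from the tails of other triples. The triples, templates, and function names are illustrative assumptions, not the pipeline used in the paper.

```python
import random

# Hypothetical ConceptNet-style triples; placeholders, not data from the paper.
TRIPLES = [
    ("stove", "UsedFor", "cooking food"),
    ("umbrella", "UsedFor", "staying dry in the rain"),
    ("alarm clock", "UsedFor", "waking up on time"),
    ("hammer", "UsedFor", "driving nails"),
]

# One natural-language template per relation type (illustrative).
TEMPLATES = {
    "UsedFor": "What is a {head} typically used for?",
}

def synthesize_qa(triples, n_distractors=2, seed=0):
    """Turn each (head, relation, tail) triple into a multiple-choice item,
    sampling distractor answers from the tails of the other triples."""
    rng = random.Random(seed)
    items = []
    for head, rel, tail in triples:
        question = TEMPLATES[rel].format(head=head)
        distractors = rng.sample(
            [t for _, _, t in triples if t != tail], n_distractors
        )
        choices = distractors + [tail]
        rng.shuffle(choices)
        items.append({
            "question": question,
            "choices": choices,
            "answer": choices.index(tail),
        })
    return items

if __name__ == "__main__":
    for item in synthesize_qa(TRIPLES):
        print(item)
```

Items produced this way can then be formatted as fine-tuning examples for an LLM; the paper's actual generation templates and distractor-sampling strategy may differ from this toy version.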
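As a companion illustration of the vertical (reasoning) direction, the following sketch shows what a small symbolic temporal reasoner could look like, in the spirit of Allen's interval algebra [4]: it classifies the basic relation holding between two event intervals. The event intervals and function names are hypothetical; the paper does not prescribe this particular implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """A temporal interval with a start and an end point."""
    start: float
    end: float

def allen_relation(a: Interval, b: Interval) -> str:
    """Return the basic Allen relation holding between intervals a and b."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start > b.start and a.end < b.end:
        return "during"
    if a.start == b.start and a.end < b.end:
        return "starts"
    if a.start > b.start and a.end == b.end:
        return "finishes"
    if a.start < b.start and b.start < a.end < b.end:
        return "overlaps"
    # All remaining cases are inverses of the relations above
    # (after, met-by, contains, started-by, finished-by, overlapped-by).
    return "inverse of " + allen_relation(b, a)

if __name__ == "__main__":
    breakfast = Interval(7.0, 7.5)   # hypothetical events
    commute = Interval(7.5, 8.25)
    meeting = Interval(9.0, 10.0)
    print(allen_relation(breakfast, commute))  # meets
    print(allen_relation(commute, meeting))    # before
```

A component of this kind could be invoked dynamically by a neuro-symbolic question-answering system whenever a temporal inference is required, rather than relying on the LLM to approximate the inference sub-symbolically.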
NeSy 2023, 17th International Workshop on Neural-Symbolic Learning and Reasoning, Certosa di Pontignano, Siena, Italy

* Corresponding author

Email: alessandro.oltramari@us.bosch.com (A. Oltramari); jon.francis@us.bosch.com (J. Francis); ilievski@isi.edu (F. Ilievski); kaixinm@andrew.cmu.edu (K. Ma); rk.mirzaee.m@gmail.com (R. Mirzaee)

ORCID: 0000-0003-1559-4852 (A. Oltramari); 0000-0002-0556-1136 (J. Francis); 0000-0002-1735-0686 (F. Ilievski); 0000-0001-7414-5673 (K. Ma); 0000-0002-7330-7818 (R. Mirzaee)

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===References===

[1] A. Oltramari, J. Francis, F. Ilievski, K. Ma, R. Mirzaee, Generalizable neuro-symbolic systems for commonsense question answering, in: P. Hitzler, M. K. Sarker (Eds.), Neuro-Symbolic Artificial Intelligence: The State of the Art, Frontiers in Artificial Intelligence and Applications, vol. 342, IOS Press, Amsterdam, 2022.

[2] "ChatGPT – Release Notes". Archived from the original on May 4, 2023. Retrieved May 4, 2023.

[3] A. G. Cohn, B. Bennett, J. Gooday, N. M. Gotts, Qualitative spatial representation and reasoning with the Region Connection Calculus, GeoInformatica 1 (1997) 275–316.

[4] J. F. Allen, G. Ferguson, Actions and events in interval temporal logic, Journal of Logic and Computation 4 (1994) 531–579.

[5] M. Ebrahimi, A. Eberhart, P. Hitzler, On the capabilities of pointer networks for deep deductive reasoning, arXiv preprint arXiv:2106.09225 (2021).

[6] A. d'Avila Garcez, S. Bader, H. Bowman, L. C. Lamb, L. de Penning, H. Poon, G. Zaverucha, Neural-symbolic learning and reasoning: A survey and interpretation, Neuro-Symbolic Artificial Intelligence: The State of the Art 342 (2022) 1.