Speak Well or Be Still: Solving Conversational AI with Weighted Attribute Grammars (Poster)

Vadim Zaytsev
Formal Methods and Tools, University of Twente, The Netherlands
vadim@grammarware.net, https://grammarware.net/, ORCID 0000-0001-7764-4224

MeSS’22: International Workshop on MDE for Smart IoT Systems, 5 July 2022, Nantes, France

There is a growing need to specify models of possible conversations with non-human entities. Such models face a difficult task: they must set the bar for correctness, yet tolerate conversations that conform to it only partially, and accommodate computations that non-human entities perform during the conversation on several, possibly independent, emergent models with unrelated flows of information. As it turns out, this can be specified formally and fairly concisely with weighted attribute grammars [6]. In this paper, a variant of those is presented, called WAGIoT, which combines the power of analytic, generative, attribute, weighted and probabilistic grammars in one DSML.

Conversation entities are an essential part of smart IoT systems. They come in many forms, largely synonymous: conversational AI, virtual digital assistants, interactive agents, smart bots, chatbots, etc. With the Turing test [17] as the ultimate goal of artificial intelligence, conversation programs became an iconic example of an AI system very early. Notable milestones shaping and eventually commoditising the trend were ELIZA [19], PARRY [4], A.L.I.C.E. [18], Wolfram Alpha [20], IBM Watson [9], Siri [3], Cortana [12], Alexa [2], Google Assistant [7], Alisa [22] and Bixby [15]. Conversation agents are needed in many areas, from smart homes to game design. We refer to the excellent recent overview of this research direction by Adamopoulou and Moussiades [1] and focus now on the relation between the linguistic component and the operational logic of the conversation entity.

Looking at the problem linguistically, the conversation can be encoded as an automaton whose states represent internal states of the conversation component and whose transitions are annotated with inputs (coming from the user or an edge device) and outputs (sent to the user or to an actuator). In computation theory, such automata are called Mealy machines [11] if they are deterministic and have a finite number of states. Both limitations are unfortunately too crippling, so the substantial body of research on Mealy machines cannot be applied directly. What might be theoretically more feasible is an input/output extension of pushdown transition systems or process rewrite systems [10], which can handle an infinite or even uncountable number of states and transitions, have enough memory to handle complex tasks intelligently, and still form a strict subclass of Turing machines for which reachability is decidable.

The lack of available theories pushed people to consider hybrid setups. For instance, actual derivations can be handled by a grammar, but the grammar rules that get applied must follow an associated coloured graph [21] or a path in a Petri net [5].
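To make the automaton view above concrete, the following is a minimal Python sketch of a Mealy-style conversation component. It is not taken from this paper or from any existing tool; the state names, input patterns and replies are invented for illustration. A deterministic finite transducer reduces to a table mapping a state and an input to an output and a next state, which also makes the limitations mentioned above visible: there is no memory beyond the current state, and no room for weighted or nondeterministic responses.

from typing import Callable, Dict, List, Tuple

State = str
# Each transition: (input predicate, output utterance, next state).
Transition = Tuple[Callable[[str], bool], str, State]

# Hypothetical conversation fragment; all names and phrases are illustrative.
TRANSITIONS: Dict[State, List[Transition]] = {
    "start": [
        (lambda s: s.lower() in {"hello", "good morning"},
         "greetings, human! what is your name?", "awaiting_name"),
    ],
    "awaiting_name": [
        (lambda s: s.lower().startswith("i am "),
         "nice to meet you!", "idle"),
    ],
    "idle": [
        (lambda s: s.lower() in {"stop", "off"}, "goodbye", "done"),
    ],
}

def step(state: State, user_input: str) -> Tuple[State, str]:
    """One Mealy-style step: read an input, emit an output, move to the next state."""
    for accepts, output, next_state in TRANSITIONS.get(state, []):
        if accepts(user_input):
            return next_state, output
    # Tolerate non-conforming input: stay in the same state with a fallback reply.
    return state, "sorry, I did not catch that"

print(step("start", "hello"))  # ('awaiting_name', 'greetings, human! what is your name?')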
The contemporary systems used in industry, like RiveScript [13], are also hybrid in nature: the conversation is specified as request-response pairs (akin to the event-based paradigm in grammarware [23]) with ad hoc computations on global data. Powerful off-the-shelf AI packages like Dialogflow [8], which lift natural language understanding tasks to intents and their fulfillment and use a smart knowledge connector to incorporate existing data, allow these computations to be arbitrarily complex. For example, Salvi et al. build Jamura [14], a conversational smart home assistant that uses the Telegram API for collecting user input, Dialogflow Agents for processing it, and the ThingSpeak platform to connect the server and the clients. One of the heterogeneous approaches is binary context-free grammars [16], which combine two generative grammars into one mathematical object.

Essentially, our proposal is to combine attribute grammars (which can propagate and share data in a very controlled fashion), weighted grammars (which provide highly controllable nondeterminism), analytic grammars (for parsing user inputs) and generative grammars (for producing answers). Our domain-specific language for conversation models is called WAGIoT and is in fact a tailored implementation of Weighted Attribute Grammars [6]. It allows conversation designers to specify interactions with actuators, sensors and users. A typical WAGIoT model can be seen in Figure 1. Many model transformations infer enough information from the provided model to make it formally executable. For instance:

β—‡ a synthesized attribute 𝑛. in getname and an inherited attribute 𝑛˙ in time have the same name; the given grammar is represented as a directed graph of nonterminals, in which the WAGIoT normaliser finds the shortest path between these two places and makes sure the information is properly propagated;
β—‡ analytic nonterminals such as greet and stop have multiple branches, which get assigned counters; these are increased each time a branch is chosen, and this information is always available elsewhere, wherever there is a need to build a model of the human user’s behaviour or speech patterns;
β—‡ all unassigned weights are calculated from the available information; for instance, the time branch in rule (10) gets the probability [1 : 1 + 𝑀] because 𝑀 is a dynamic value, so the only option is to add the default 1; if rule (11) had the weight [1 : 3] instead of [𝑀], then rule (10) would get the weight [2 : 3] to compensate for the missing static part (see the sketch below);
β—‡ types of attributes are inferred: 𝑛 is determined to be a string due to the built-in nonterminal Id, and 𝑀 is an integer because it is used as a weight;
β—‡ code blocks between ⟦ and ⟧ are expected to conform to a special interface with methods responding to their use in generative and analytic contexts, and are otherwise lowered to strings, which is what happens in rules (10–11).

Our preliminary experiments show that these transformations hide details that would otherwise overwhelm those who are supposed to write such models. Although we claim that the simplicity of RiveScript [13] pushes its users to combine it with informal hacks, exposing too much of the complexity at once is just as damaging. More work is required to find the best balance, as well as to pursue obvious enhancements like replacing fixed terminals with signals from an NLP model. Our ongoing endeavours are made public at https://github.com/grammarware/wagl.
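To illustrate the weight completion mentioned in the list above, here is a hedged sketch; it is not the actual WAGIoT implementation, and the names Branch and pick_branch, the placeholder strings, and the initial value of 𝑀 are assumptions made for the example. It mirrors rules (10)–(11) of Figure 1: the unweighted branch receives the default weight 1 and hence the probability 1/(1+𝑀), while the branch weighted with the dynamic value 𝑀 gets 𝑀/(1+𝑀). With a static weight [1 : 3] on one branch, the same completion would assign the remaining [2 : 3] to the other.

import random
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Branch:
    produce: Callable[[], str]                    # generative action of the branch
    weight: Optional[Callable[[], float]] = None  # dynamic weight, if one was given

def pick_branch(branches: List[Branch]) -> str:
    """Complete missing weights with the default 1, then draw proportionally."""
    weights = [max(0.0, b.weight()) if b.weight else 1.0 for b in branches]
    total = sum(weights)
    r, acc = random.uniform(0, total), 0.0
    for branch, wt in zip(branches, weights):
        acc += wt
        if r <= acc:
            return branch.produce()
    return branches[-1].produce()

# Mirrors rules (10)-(11): the attribute assignments attached to a rule run when
# the rule is applied, so w resets to 5 on the long answer and decays on the short one.
w = 5  # initial value assumed here; the grammar itself does not fix it

def full_answer() -> str:   # rule (10): unweighted, probability 1/(1+w)
    global w
    w = 5
    return "it is <DateTime.Now>, <name>"

def short_answer() -> str:  # rule (11): weight [w], probability w/(1+w)
    global w
    w -= 1
    return "it is <DateTime.Now.Hour> o'clock"

print(pick_branch([Branch(full_answer), Branch(short_answer, lambda: w)]))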
𝑆 ← setup activityβˆ— stop.  (1)
setup ← greet? getname.  (2)
greet ← 'hello'.  (3)
greet ← 'good morning'.  (4)
greet β†’ 'greetings, human!' 'what is your name?'.  (5)
getname ← 'I am' 𝑛. := Id.  (6)
getname β†’ 'nice to meet you,' 𝑛. .  (7)
activity ← time | β‹―.  (8)
time ← 'what time is it?'.  (9)
time β†’ 'it is' ⟦DateTime.Now⟧ ',' 𝑛˙ β–Ά 𝑀 := 5.  (10)
time β†’ [𝑀] 'it is' ⟦DateTime.Now.Hour⟧ 'oβ€² clock' β–Ά 𝑀 := 𝑀 βˆ’ 1.  (11)
stop ← 'stop' | 'off'.  (12)

Figure 1: A simplified WAGIoT grammar for a holistic example showcasing the notation. All ←-rules are analytic, all β†’-rules are generative. Having multiple analytic rules means that any of them can be applied, as in (3–4). Having multiple generative rules means that one of them is chosen at random with probabilities following the rule weights, as in (10–11). Rule (1) contains a regular right part (a Kleene star). Rule (12) contains a shorthand notation for multiple rules of the same kind for the same nonterminal (a Backus bar). The ellipsis in rule (8) designates that this is but a fragment of a larger grammar. The := in rules (6) and (10–11) denotes attribute assignment. π‘₯Λ™ denotes an inherited attribute (propagated downwards in the tree); π‘₯. denotes a synthesized attribute (propagated upwards). Weights are printed in square brackets.

References

[1] E. Adamopoulou, L. Moussiades, An Overview of Chatbot Technology, in: AIAI, Springer, 2020, pp. 373–383.
[2] Amazon, Amazon Alexa Voice AI, https://developer.amazon.com/en-US/alexa, 2014.
[3] Apple, Siri, https://www.apple.com/siri/, 2011.
[4] K. M. Colby, S. Weber, F. D. Hilf, Artificial Paranoia, Artificial Intelligence 2 (1971) 1–25.
[5] J. Dassow, S. Turaev, π‘˜-Petri Net Controlled Grammars, in: LATA, LNCS 5196, Springer, 2008, pp. 209–220.
[6] M. Gerhold, V. Zaytsev, Towards Weighted Attribute Grammars, Under review, 2022.
[7] Google, Google Assistant, your own personal Google, https://assistant.google.com, 2016.
[8] Google, Google Cloud: Dialogflow, https://cloud.google.com/dialogflow/docs/, 2017.
[9] IBM, IBM Watson, https://www.ibm.com/watson, 2011.
[10] R. Mayr, Process Rewrite Systems, Information and Computation 156 (2000) 264–286.
[11] G. H. Mealy, A Method for Synthesizing Sequential Circuits, Bell System Technical Journal 34 (1955) 1045–1079.
[12] Microsoft, Cortana, Your Personal Productivity Assistant, https://www.microsoft.com/en-us/cortana, 2014.
[13] N. Petherbridge, RiveScript, https://www.rivescript.com, 2018.
[14] S. Salvi, V. Geetha, S. Sowmya Kamath, Jamura: A Conversational Smart Home Assistant Built on Telegram and Google Dialogflow, in: TENCON, IEEE, 2019, pp. 1564–1571.
[15] Samsung, Bixby | Apps & Services, http://bixby.samsung.com/, 2017.
[16] S. Turaev, R. Abdulghafor, A. A. Alwan, A. A. Almisreb, Y. Gulzar, Binary Context-Free Grammars, 2020.
[17] A. M. Turing, Computing Machinery and Intelligence, Mind 59 (1950) 433–460.
[18] R. S. Wallace, The Anatomy of A.L.I.C.E., in: Parsing the Turing Test, Springer, 2009, pp. 181–210.
[19] J. Weizenbaum, ELIZA β€” A Computer Program for the Study of Natural Language Communication between Man and Machine, CACM 9 (1966) 36–45. doi:10.1145/365153.365168.
[20] Wolfram Research, Wolfram|Alpha: Computational Intelligence, https://www.wolframalpha.com, 2009.
[21] D. Wood, A Note on Bicolored Digraph Grammar Systems, IJCM 3 (1973) 301–308.
[22] Yandex, Alisa β€” Voice Assistant from the Yandex company, https://yandex.ru/alice, 2017.
[23] V. Zaytsev, Event-Based Parsing, in: REBLS, 2019. doi:10.1145/3358503.3361275.