A Reactive Cognitive Architecture based on Natural Language Processing for the task of Decision-Making using a Rich Semantics

Carmelo Fabio Longo (a), Francesco Longo (b) and Corrado Santoro (a)
(a) Department of Mathematics and Computer Science, University of Catania, Viale Andrea Doria, 6, 95125 Catania, Italy
(b) Department of Engineering, University of Messina, Contrada di Dio, S. Agata, 98166 Messina, Italy

Abstract
The field of cognitive architectures is rich in approaches featuring a wide range of abilities typical of the human mind, such as perception, action selection, learning, reasoning, and meta-reasoning. However, those leveraging Natural Language Processing are quite limited in both domain and reasoning capabilities. In this work, we present a cognitive architecture called CASPAR, based on a Belief-Desire-Intention framework, capable of reactive reasoning over a highly descriptive semantics made of First Order Logic predicates parsed from natural language utterances.

Keywords
Cognitive Architecture, Natural Language Processing, Artificial Intelligence, First Order Logic, Internet of Things

WOA 2020: Workshop "From Objects to Agents", September 14-16, 2020, Bologna, Italy
fabio.longo@unict.it (C. F. Longo); flongo@unime.it (F. Longo); santoro@dmi.unict.it (C. Santoro)
ORCID: 0000-0002-2536-8659 (C. F. Longo); 0000-0001-6299-140X (F. Longo); 0000-0003-1780-5406 (C. Santoro)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

1. Introduction
In the last decade, a large number of devices connected together and controlled by AI has entered millions of houses: the pervasive market of the Internet of Things (IoT). This phenomenon extends also to domains other than the domestic one, such as smart cities, remote e-healthcare, industrial automation, and so on. In most of them, especially the usual domestic ones, vocal assistants play an important role, because voice is the most natural way to give the user the feeling of dealing with an intelligent sentient being who cares about the proper functioning of the home environment.

But how intelligent are these vocal assistants actually? Although there can be several definitions of intelligence, in this work we are interested only in those related to autonomous agents acting in the scope of decision-making. Nowadays, companies producing vocal assistants aim more at increasing their pervasiveness than at improving their native reasoning capabilities; by reasoning capabilities, we mean not only the ability to infer the proper association command → plan from utterances, but also the capability of combining facts with rules in order to infer new knowledge and help the user in decision-making tasks. Apart from the well-known cloud-based vocal assistants [1], other kinds of solutions [2, 3, 4] are based on neural models exclusively trained on the domotic domain, or they exploit chat engines [5, 6] whose understanding skills depend strictly on syntax. This makes the range of their capabilities quite limited.

In light of the above, our aim in this paper is the design of a cognitive architecture, called CASPAR, based on Natural Language Processing (NLP), that makes possible the implementation of intelligent agents able to outclass the available ones in performing deductive activities.
Such agents could be used both for domotic purposes and for any other kind of application involving common deductive processes based on natural language. As a further motivation, we have to highlight that, as claimed in [7], cognitive architectures have so far been used mainly as research tools, and very few of them have been developed outside academia; moreover, none of them has been specifically designed for IoT. Of course, most of them have features and resources which could be exploited in such a domain, but their starting motivations were different from ours. Although cognitive architectures should be distinguished from the models that implement them, our architecture can be used as a domotic agent as is, after the definition of both the involved entities and the I/O interfaces.

This paper is structured as follows: Section 2 describes the state of the art of the related literature; Section 3 shows in detail all the architecture's components and underlying modules; Section 4 shows the architecture's reasoning heuristic in the presence of clauses made of composite predicates, taking possible argument substitutions into account as well; Section 5 summarizes the content of the paper and provides our conclusions, together with future work perspectives. A Python implementation of CASPAR is also provided for research purposes in a GitHub repository (http://www.github.com/fabiuslongo/pycaspar).

2. Related work
The number of existing cognitive architectures has reached several hundred according to the authors of [7]. Among the most popular ones, which also influenced several subsequent works, there are SOAR, CLARION and LIDA, mentioned in a theoretical comparison in [8]. Most of them were inspired either by neuroscience or by psychoanalysis/philosophy studies; the former are arguably less speculative, being supported by scientific data regarding the functions of brain modules in specific conditions and their interactions. The Integrated Information Theory [9] even provides a metric, Phi, to evaluate the consciousness level of a cognitive system, which would be proportional to those overall interactions. In this section, we will focus mostly on architectures implementing Reasoning/Action Selection, Natural Language Processing and Decision-Making, these being the main basis on which CASPAR has been built.

In [10] the authors describe three different spoken dialog systems, one of them based on the FORR architecture and designed to fulfill the task of ordering books from the public library by phone. All three dialog systems are based on a local Speech-to-Text engine called PocketSphinx, which is notoriously less accurate than cloud-based systems [11]. This leads to a greater struggle to reduce the gap between the user's request and the result.

The authors of [12] present a computational model called MoralDM, which integrates multiple AI techniques to model human moral decision-making, by leveraging a two-layer inference engine which takes into account prior case decisions and a knowledge base with a formal representation of moral quality-weighted facts.
Such facts are extracted from natural language by using a semi-automatic translator from simplified-English scenarios (which is the major weakness of such an approach) into predicate calculus.

The DIARC architecture [13] has been designed to address the issue of recognizing morally and socially charged situations in human-robot collaborations. Although it exploits several well-known NLP resources (such as Sphinx, VerbNet, and FrameNet), it has been tested only on trivial examples in order to trigger robot reactions, using an ad-hoc symbolic representation of both known and perceived facts.

In general, probing the existing cognitive architectures leveraging NLP, we have found that most of them are limited both in domain of application and in terms of semantic complexity.

3. The Architecture
The name that has been chosen for the architecture presented in this paper is CASPAR. It derives from the following words: Cognitive Architecture System Planned and Reactive, which summarize its two main features. In Figure 1, all interacting components are depicted, filled with distinct colours.

[Figure 1: The Software Architecture of CASPAR]

The main component of this architecture, namely the Reactive Reasoner, acts as a "core router" by delegating operations to the other components, and providing all the functions needed to make the whole system fully operative. This architecture's Knowledge Base (KB) is divided into two distinct parts operating separately, which we will distinguish as Beliefs KB and Clauses KB: the former contains information about the physical entities which affect the agent and which we want the agent to affect; the latter contains conceptual information not perceived by the agent's sensors, but on which we want the agent to make logical inference.

The Beliefs KB provides exhaustive cognition about what the agent can expect as input data coming from the outside world; as the name suggests, this cognition is managed by means of proper beliefs that can, in turn, activate proper plans in the agent's behaviour. The Clauses KB is defined by means of assertions/retractions of nested First Order Logic (FOL) definite clauses, possibly made of composite predicates, and it can be interrogated, providing an answer (True or False) to any query.

The two KBs represent, somehow, two different kinds of human memory: the so-called procedural memory or implicit memory [14], made of thoughts directly linked to concrete and physical entities, and the conceptual memory, based on cognitive processes of comparative evaluation. As in human beings, in this architecture the two KBs can interact with each other in a very reactive decision-making process.

3.1. The Translation Service
This component (left box in Figure 1) is a pipeline of five modules with the task of taking a sound stream in natural language and translating it into a neo-davidsonian FOL expression, inheriting its shape from the event-based formal representation of Davidson [15], where for instance the sentence:

Brutus stabbed suddenly Caesar in the agora (1)

is represented by the following notation:

∃e stabbed(e, Brutus, Caesar) ∧ suddenly(e) ∧ in(e, agora)

The variable e, which we call the davidsonian variable, identifies the verbal action related to stabbed. In case a sentence contains more than one verbal phrase, we will use indexes to distinguish e_i from e_j, with i ≠ j.
As for the notation used in this work, it does not use ground terms as arguments of the predicates, in order to permit the sharing of different features related to the same term. For example, if we include the adjective evil:

∃e stabbed(e, Brutus(x), Caesar(y)) ∧ evil(x) ∧ suddenly(e) ∧ in(e, agora(z))

which can also be represented, ungrounding the verbal action arguments, as follows:

∃e stabbed(e, x, y) ∧ Brutus(x) ∧ Caesar(y) ∧ evil(x) ∧ suddenly(e) ∧ in(e, z) ∧ agora(z)

Furthermore, in the notation used for this work each predicate label is in the form L:POS(t), where L is a lemmatized word and POS is a Part-of-Speech (POS) tag from the Penn Treebank tagset [16].

The first module in the pipeline, i.e., the Automatic Speech Recognition (ASR) [17, 18, 19], allows a machine to understand the user's speech and convert it into a series of words.

The second module is the Dependency Parser, which aims at extracting the semantic relationships, namely dependencies, between all the words in an utterance. In [20], the authors present a comparative analysis of ten leading statistical dependency parsers on a multi-genre corpus of English.

The third module, the Uniquezer, renames all the entities within each dependency in order to make them unique. Such a task is mandatory to ensure the correctness of the outcomes of the next module in the pipeline (the MST Builder), whose data structures need a distinct reference to each entity coming from the dependency parser.

The fourth module, the MST Builder, has the purpose of building a novel semantic structure called Macro Semantic Table (MST), which summarizes in a canonical shape all the semantic features of a sentence, starting from its dependencies, in order to derive FOL expressions. Here is the general schema of an MST, referred to the utterance u:

MST(u) = {ACTIONS, VARLIST, PREPS, BINDS, COMPS, CONDS}

where

ACTIONS = [(label_k, e_k, x_i, x_j), ...]
VARLIST = [(x1, label1), ..., (x_n, label_n)]
PREPS = [(label_j, (e_k | x_i), x_j), ...]
BINDS = [(label_i, label_j), ...]
COMPS = [(label_i, label_j), ...]
CONDS = [e1, e2, ...]

All the tuples inside such lists are populated with variables and labels whose indexing is considered disjoint among distinct lists, although there are significant relations which will be clarified later. The MST building also takes into account the analysis done in [21] about so-called slot allocation, which prescribes specific policies about an entity's location inside each predicate, depending on verbal cases. This is because the human mind, in the presence of any utterance, is able to implicitly populate every semantic role (identified by subject/object slots) taking part in a verbal action, in order to create and interact with a logical model of the utterance. In this work, by leveraging a step-by-step analysis of the dependencies, we want to create such a model artificially, to give an agent the chance to make logical inference on the available knowledge. All the dependencies used in this paper are part of the ClearNLP [22] tagset, which is made of 46 distinct entries. For instance, these are the dependencies of (1):

nsubj(stabbed, Brutus)
ROOT(stabbed, stabbed)
advmod(stabbed, suddenly)
dobj(stabbed, Caesar)
prep(stabbed, In)
det(agora, The)
pobj(in, agora)
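Dependencies of this kind can be obtained with any parser emitting the ClearNLP tagset. The following minimal sketch, given purely as an illustration, uses spaCy; this is our assumption, since the architecture only requires some ClearNLP-style dependency parser, and spaCy's English models happen to emit these labels:

import spacy

nlp = spacy.load("en_core_web_sm")

def dependencies(utterance):
    # Each token yields a triple label(head, dependent), e.g. nsubj(stabbed, Brutus)
    doc = nlp(utterance)
    return [(tok.dep_, tok.head.text, tok.text) for tok in doc]

for label, head, dep in dependencies("Brutus stabbed suddenly Caesar in the agora"):
    print("%s(%s, %s)" % (label, head, dep))

Depending on the model version, minor label or casing differences may appear; the pipeline only relies on the relations discussed here (nsubj, dobj, prep, pobj, det, advmod, mark, amod, compound).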
From the pair nsubj/dobj it is possible to create a new tuple inside ACTIONS as follows, also taking the variable indexing count into account:

(stabbed, e1, x1, x2)

and inside VARLIST as well:

(x1, Brutus)
(x2, Caesar)

Similarly, after an analysis of the pair prep/pobj, it is possible to create a further tuple inside PREPS as follows:

(in, e1, x3)

and inside VARLIST:

(x3, agora)

The dependency advmod carries the information that the verb (stabbed) is modified by the adverb suddenly. In light of this, a further tuple will be created inside VARLIST as follows:

(e1, suddenly)

As for the BINDS list, it contains tuples playing a quality-modifier role: in case (1) had the brave Caesar as object, a further dependency amod would be created as follows:

amod(Caesar, brave)

In this case, a bind between Caesar and brave would be created inside BINDS as follows:

(Caesar, brave)

As with BINDS, COMPS contains tuples of terms related to each other, but in this case they are parts of multi-word nouns like Barack Hussein Obama, whose nouns after the first are classified as compound by the dependency parser.

As for the CONDS list, it contains the davidsonian variables whose related predicates subordinate the remaining ones. For instance, in the presence of utterances like:

if the sun shines strongly, Robert drinks wine

or

while the sun shines strongly, Helen smiles

in both cases the dependency mark will give information about subordinate conditions related to the verb shines, namely mark(shines, If) and mark(shines, while). In those cases, the davidsonian variable related to shines will populate the list CONDS. In the same way, in the presence of the word when, a subordinate condition might be inferred as well; but since all adverbs are classified as advmod (as we have seen for suddenly before), when will be considered a subordinate condition only when its POS is WRB and not RB; the former denotes a wh-adverb, the latter a qualitative adverb.

The fifth and last module, the FOL Builder, builds FOL expressions starting from the MSTs. Since (virtually) all approaches to formal semantics assume the Principle of Compositionality ("The meaning of a whole is a function of the meanings of the parts and of the way they are syntactically combined."), formally formulated by Partee [23], every semantic representation can be built up incrementally as constituents are put together during parsing. In light of the above, it is possible to build FOL expressions straightforwardly starting from an MST, which summarizes all the semantic features extracted during a step-by-step analysis of the dependencies.

For the rest of the paper, the labels inside the MST tuples will be in the form lemma:POS. Then, for instance, instead of stabbed we will have stab:VBD, where stab is the lemmatization of stabbed and VBD is the POS tag representing a past tense. For each tuple (var, lemma:POS) in VARLIST the following predicate will be created:

lemma:POS(var)

which represents a noun, such as tiger:NN(x1) or Robert:NNP(x1) (without considering entity enumeration). var can also be a davidsonian variable when POS has the value RB. In such cases, the tuples represent adverbs, such as Hardly:RB(e1) or Slowly:RB(e2). For each tuple (lemma:POS, dav, subj, obj) in ACTIONS, the following predicate will be created:

lemma:POS(dav, subj, obj)

representing a verbal action, such as be:VBZ(e1, x1, x2) or shine:VBZ(e2, x3, x4).
For each tuple (lemma:POS, dav/var, obj) in PREPS the following predicate will be created:

lemma:POS(dav/var, obj)

where dav/var is a variable appearing in a tuple of ACTIONS or of VARLIST, respectively, while obj is a variable in a tuple of VARLIST. Such predicates represent verbal/noun prepositions. For each tuple (lemma:POS1, lemma:POS2) in COMPS, whose first entity lemma:POS1 is in a tuple of VARLIST, a predicate will be created as follows:

lemma:POS2(var)

where var is the variable of the tuple in VARLIST having lemma:POS1 as second entity. In the case of multi-word nouns, each noun beyond the first one in VARLIST will be encoded within COMPS.

As for CONDS, its usage is explained next with an example. Let the sentence under examination be:

When the sun shines strongly, Robert is happy (2)

The related MST is:

ACTIONS = [(shine01:VBZ, e1, x1, x2), (be01:VBZ, e2, x3, x4)]
VARLIST = [(x1, sun01:NN), (x2, ?), (x3, Robert01:NNP), (x4, happy01:JJ)]
CONDS = [e1]

Notice the enumeration of the entities within each list, an effect of the Uniquezer processing before the MST building. As the final outcome we will have an implication like the following:

shine01:VBZ(e1, x1, _) ∧ sun01:NN(x1) =⇒ be01:VBZ(e2, x3, x4) ∧ Robert01:NNP(x3) ∧ happy01:JJ(x4)
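The FOL Builder rules above can be condensed in a few lines. The sketch below rebuilds the implication for sentence (2) from its MST, under the assumption that the MST lists are held as plain Python tuples; the actual pycaspar data structures may differ:

ACTIONS = [("shine01:VBZ", "e1", "x1", "x2"), ("be01:VBZ", "e2", "x3", "x4")]
VARLIST = [("x1", "sun01:NN"), ("x2", None), ("x3", "Robert01:NNP"), ("x4", "happy01:JJ")]
CONDS = ["e1"]

LABELS = dict(VARLIST)  # variable -> noun/adjective label (None when unknown)

def fol_for(dav):
    # Emit the verbal predicate plus one predicate per labelled argument
    label, e, subj, obj = next(a for a in ACTIONS if a[1] == dav)
    args = [e] + [v if LABELS.get(v) else "_" for v in (subj, obj)]
    preds = ["%s(%s)" % (label, ", ".join(args))]
    preds += ["%s(%s)" % (LABELS[v], v) for v in (subj, obj) if LABELS.get(v)]
    return " ∧ ".join(preds)

antecedent = " ∧ ".join(fol_for(e) for e in CONDS)
consequent = " ∧ ".join(fol_for(a[1]) for a in ACTIONS if a[1] not in CONDS)
print(antecedent, "=⇒", consequent)
# shine01:VBZ(e1, x1, _) ∧ sun01:NN(x1) =⇒ be01:VBZ(e2, x3, x4) ∧ Robert01:NNP(x3) ∧ happy01:JJ(x4)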
3.2. The Reactive Reasoner
As already mentioned, this component (central box in Figure 1) has the task of letting the other modules communicate with each other; it also includes additional modules such as the Speech-To-Text (STT) Front-End, the IoT Parsers (Direct Command Parser and Routine Parser), the Sensor Instances, and the Definite Clauses Builder. The Reactive Reasoner also contains the Beliefs KB, which supports both Reactive and Cognitive reasoning. The core of this component's processing is managed by the Belief-Desire-Intention framework Phidias [24], which gives Python programs the ability to perform logic-based reasoning (in Prolog style) and lets developers write reactive procedures, i.e., pieces of program that can promptly respond to environment events.

The agent's first interaction with the outer world happens through the STT Front-End, which is made of production rules reacting on the basis of specific words asserted by a Sensor Instance; the latter, being an instance of the superclass Sensor provided by Phidias, will assert a belief called STT(X), with X as the recognized utterance, after the sound stream is acquired by the microphone and translated into text by means of the ASR.

The Direct Command Parser has the task of combining the predicates of FOL expressions having common variables, coming from the Translation Service, via a production rule system. The final outcome of such rules is a belief called INTENT, which might trigger another rule in the Smart Environment Interface. A similar behaviour is reserved for the Routine Parser, when subordinating conditions within an IoT command are detected; it produces two types of beliefs, ROUTINE and COND, linked together by a unique code. The belief ROUTINE is a sort of pending INTENT, which cannot match any production rule and execute its plan until the content of its related COND meets that of another belief, called SENSOR, asserted by a Sensor Instance. Then the ROUTINE belief will be turned into an INTENT and becomes ready for execution as a direct command, as shown in lines 2, 3, 5, 7, 8 of Listing 1 in the Appendix.

The Definite Clauses Builder is responsible for combining the predicates of FOL expressions having common variables, through a production rule system, in order to produce nested definite clauses. Considering sentence (2) and its related FOL expression produced by the Translation Service, the production rule system of the Definite Clauses Builder, taking the POS of each predicate into account, will produce the following nested definite clause:

shine01:VBZ(sun01:NN(x1), _) =⇒ be01:VBZ(Robert01:NNP(x3), happy01:JJ(x4))

The rationale behind such a notation choice is the following: a definite clause is either atomic or an implication whose antecedent is a conjunction of positive literals and whose consequent is a single positive literal. Because of such restrictions, in order to make MST-derived clauses suitable for inference with the Backward-Chaining algorithm (which works only with KBs made of definite clauses), we must be able to encapsulate all their information properly. The strategy followed is to create composite terms, taking the POS tags into account and applying the following hierarchy to every noun expression:

IN(JJ(NN(NNP(x))), t) (3)

where IN is a preposition label, JJ an adjective label, NN and NNP are noun and proper noun labels, x is a bound variable and t a predicate. As for the verbal actions, the nesting hierarchy will be the following:

ADV(IN(VB(t1, t2), t3))

where ADV is an adverb label, IN a preposition label, VB a verb label, and t1, t2, t3 are predicates; in the case of an intransitive or imperative verb, the argument t2 or t1 of VB, respectively, will be left void. As we can see, a preposition might be related either to a noun or to a verb.
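A toy illustration of this nesting policy follows; the helper is ours, purely for illustration of hierarchy (3) and of the verbal one, while pycaspar's builder works on MST tuples rather than strings:

def nest(*layers):
    # Wrap the innermost term (last argument) with the given labels, outermost first:
    # nest("Hard", "Shine(Sun(x1), __)") -> "Hard(Shine(Sun(x1), __))"
    term = layers[-1]
    for label in reversed(layers[:-1]):
        term = "%s(%s)" % (label, term)
    return term

subject = nest("Sun", "x1")           # noun chain: Sun(x1)
verb = "Shine(%s, __)" % subject      # intransitive: the object slot is left void
print(nest("Hard", verb))             # the adverb wraps the whole verbal action
# Hard(Shine(Sun(x1), __))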
3.3. The Smart Environment Interface
This component (upper right box in Figure 1) provides a bidirectional interaction between the architecture and the outer world. In Listing 1 in the Appendix, a simple example is shown, where a production rule system is used as a reactive tool to trigger proper plans in the presence of specific asserted beliefs. In [25] we showed the effectiveness of this approach by leveraging the Phidias predecessor Profeta [26], even with a shallower analysis of the semantic dependencies, together with an encoding of operations via WordNet [27] to make the operating agent multi-language and multi-synonymous.

Such an interface includes a production rule system containing different types of entity definitions and operation codes involving the entities themselves, which trigger specific procedures containing high-level language (e.g., lines 11 and 12 in Listing 1 in the Appendix). The latter should contain all the functions required for driving each device in order to obtain the desired behaviour; their implementation is left to the developer in this work. Each production rule also contains subordinating conditions defined as Active Beliefs: lemma_in_syn(X, S) checks the membership of the lemma X in the synset S, to make the rule multi-language and multi-synonymous (after having defined the entities depending on the language); the Active Belief eval_cls(Y), instead, lets the Beliefs KB and the Clauses KB interact with each other in a genuine decision-making process, where the agent decides whether or not to execute the related plan within the square brackets, according to the outcome of the reasoning on the query Y; the latter, in lines 12-13 of Listing 1 in the Appendix, is the representation of the sentence an inhabitant is at home. Finally, this module also contains production rules to change routines into direct commands according to the presence of specific beliefs related to conditionals, which might or might not be asserted by some Sensor Instance (see lines 2, 3, 5, 8, 9 of Listing 1 in the Appendix).

3.4. The Cognitive Reasoner
This component (bottom right box in Figure 1) allows an agent to assert/query the Clauses KB with nested definite clauses, where each predicate argument can be another predicate and so on, built by the Definite Clauses Builder module (within the Reactive Reasoner). Beyond the nominal FOL reasoning with the well-known Backward-Chaining algorithm, this module also exploits another class of logical axioms, the so-called assignment rules. We refer to a class of rules of the type "P is-a Q", where P is a predicate whose variable travels from one hand side of the implication symbol to the other. For example, if we want to express the concept Robert is a man, we can use the following closed formula:

∀x Robert(x) =⇒ Man(x) (4)

But before that, we must consider a premise: the introduction of such rules in a KB is possible only by shifting all its predicates from a strictly semantic domain to a purely conceptual one, because in a semantic domain we only have knowledge of the morphological relationships between words given by their syntactic properties. Basically, we need a medium to give additional meaning to our predicates, which is provided by WordNet [27]. This allows us to make logical reasoning in a conceptual space thanks to the following functions:

F_I : P_S −→ P_C
F_Args(F_I) : X_S^n −→ Y_C^n (5)

F_I is the Interpreter Function between the space P_S of all the semantic predicates which can be yielded by the MST sets and the space P_C of all conceptual predicates; it is not injective, because a single semantic predicate might have multiple correspondences in the codomain, one for each different synset containing the lemma under examination. F_Args(F_I) maps between the domain and codomain of the arguments of the predicates related by F_I, which have equal arity. For instance, consider the FOL expression of the sentence behind (4):

be:VBZ(e1, x1, x2) ∧ Robert:NNP(x1) ∧ man:NN(x2)

After an analysis of be, we find the lemma within the WordNet synset encoded by be.v.01 and defined by the gloss: have the quality of being something. This is the medium we need for the domain shift which gives a common-sense meaning to our predicates. In light of the above, in the new conceptual domain given by (5), the same expression can be rewritten as:

be_VBZ(d1, y1, y2) ∧ Robert_NNP(y1) ∧ man_NN(y2)

where be_VBZ is fixed on the value which identifies y1 with y2, Robert_NNP(x) means that x identifies Robert, and man_NN(x) means that x identifies a man. Considering the meaning of be_VBZ, it also makes sense to rewrite the formula as:

∀y Robert_NNP(y) =⇒ man_NN(y) (6)

where y is a bound variable like x in (4). Having such a rule in a KB means that we can implicitly admit additional clauses having man_NN(y) as argument instead of Robert_NNP(y).
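The synset lookup licensing this domain shift (and, likewise, the lemma_in_syn Active Belief of Section 3.3) can be reproduced, for instance, with NLTK's WordNet interface; this binding is our assumption, since the paper names WordNet but not a specific access library:

from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def lemma_in_syn(lemma, synset_id):
    # True when `lemma` belongs to the synset encoded by `synset_id`
    return lemma.lower() in wn.synset(synset_id).lemma_names()

print(lemma_in_syn("be", "be.v.01"))      # True
print(wn.synset("be.v.01").definition())  # the gloss used as common-sense medium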
The same expression, of course, can also be rewritten in a conceptual domain as a composite fact, where Robert_NNP(x) becomes an argument of man_NN(x), as follows:

man_NN(Robert_NNP(y)) (7)

which agrees with the hierarchy of (3) as the outcome of the Definite Clauses Builder. As claimed in [28], not every KB can be converted into a set of definite clauses, because of the single-positive-literal restriction; but many can, like the one related to this work, for the following reasons:

1. No clause made of one single literal will ever be negative, due to the closed world assumption. Negations, initially treated like any other adverb, when detected and related to the ROOT dependency, are considered polarity inverters of verbal phrases; in this case, the assert will be turned into a retract.
2. When the right hand side of a clause is made of more than one literal, it is easy to show that, by applying the implication elimination rule and the distributivity of ∨ over ∧, a non-definite clause can be split into n definite clauses (where n is the number of consequent literals).

4. Nested Reasoning and Clause Conceptual Generalizations
The aim of the Cognitive Reasoner is to query a KB made of nested clauses, clauses which are also made closer to any possible related query thanks to an appropriate pre-processing at assertion-time. Such a pre-processing, which creates a runtime expansion of the KB for every asserted clause, takes advantage of the assignment rules for the derivation of new knowledge.

The Backward-Chaining algorithm as-is might not be effective in the presence of clauses where argument manipulation is required. When the required clauses are not present in the KB but are deducible by proper argument substitutions, evaluating clauses at reasoning-time can be quite heavy and not feasible in terms of complexity, because the process requires unifications at every single step. Instead, we will show how, by properly expanding the KB at assertion-time, the reasoning itself can be achieved acceptably. To obtain such a goal, CASPAR extends the radius of the nominal Backward-Chaining through the expansion of the Clauses KB with new knowledge, generated through argument substitutions on copies of specific clauses asserted before. For instance, let us consider a KB made at most of one-level composite predicates (supposing a zero-level composite predicate to be P(x)) as follows:

P1(G1(x1)) ∧ P2(G2(x2)) =⇒ P3(F3(x3))
P1(F1(x1))
P2(F2(x2))
F1(x) =⇒ G1(x)
F2(x) =⇒ G2(x)
H3(x) =⇒ F3(x)

Querying such a KB with P3(H3(x)), for instance, the Backward-Chaining algorithm will return False, because no unifiable literal is present, either as a fact or as the consequent of a clause. Instead, by exploiting H3(x) =⇒ F3(x), we can also query the KB with P3(F3(x)), which is present as the consequent of the first clause and is surely satisfied together with P3(H3(x)): that is what we define as Nested Reasoning. Now, to continue the reasoning process, we should check the premises of such a clause, which is made of the conjunction of two literals, namely P1(G1(x1)) and P2(G2(x2)). The latter, although not initially asserted, can be obtained by argument substitution on copies of other clauses from the same KB. Such a process is achieved by implicitly asserting the following clauses together with P1(F1(x1)) and P2(F2(x2)):

P1(F1(x1)) =⇒ P1(G1(x1))
P2(F2(x2)) =⇒ P2(G2(x2))
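To make the mechanics concrete, here is a toy backward chainer over ground atoms; it is a sketch of the control flow only, since unification and variables, which the real reasoner handles, are deliberately elided:

FACTS = {"P1(F1(a))", "P2(F2(a))"}
RULES = [                                   # (premises, conclusion)
    ({"P1(G1(a))", "P2(G2(a))"}, "P3(F3(a))"),
    ({"P1(F1(a))"}, "P1(G1(a))"),           # implicit clauses added
    ({"P2(F2(a))"}, "P2(G2(a))"),           # at assertion-time
]

def backward_chain(goal):
    # A goal holds if it is a fact, or some rule concludes it
    # and all of that rule's premises recursively hold.
    if goal in FACTS:
        return True
    return any(all(backward_chain(p) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

print(backward_chain("P3(F3(a))"))  # True, thanks to the implicit clauses
print(backward_chain("P3(H3(a))"))  # False as-is: Nested Reasoning maps H3 to F3 first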
Since we cannot know in advance what a future successful reasoning will require, considering all possible nesting levels, along with the previous clauses the so-called Clause Conceptual Generalizations will also be asserted:

P1(G1(x1)) ∧ P2(G2(x2)) =⇒ F3(x3)
F1(x1)
F2(x2)

where the antecedent of the implication is unchanged, to preserve the quality of the rule, while F1(x1), F2(x2), F3(x3), as satisfiability contributors of P1(F1(x1)), P2(F2(x2)), P3(F3(x3)) respectively, are assumed asserted together with the latter. In other terms, the predicates P1, P2, P3 can be considered modifiers of F1, F2, F3, respectively. A generalization considering also the antecedent of the implicative formula would be possible only through a weaker assertion of the entire formula itself, changing =⇒ into ∧ as follows:

∃ x1, x2, x3 | G1(x1) ∧ G2(x2) ∧ F3(x3)

which is not admitted as a definite clause, not being a single positive literal. In any case, the mutual existence of x1, x2, x3 satisfying such a conjunction is already subsumed by the implication.

After this theoretical premise, let us give a more practical example, considering the following natural language utterance:

When the sun shines hard, Barbara drinks slowly a fresh lemonade

The corresponding definite clause will be (omitting the POS tags for the sake of readability):

Hard(Shine(Sun(x1), __)) =⇒ Slowly(Drink(Barbara(x3), Fresh(Lemonade(x4))))

Considering adjectives, adverbs and prepositions as modifiers, and following the schema in Table 1, all the clause generalizations (corresponding to the first three rows of the table, while the fourth is the initial clause) can be asserted as follows:

Table 1: Clause Generalizations Constituents Table (A = Applied, NA = Not Applied)

Hard | Slowly | Fresh
A    | NA     | NA
A    | A      | NA
A    | NA     | A
A    | A      | A

Hard(Shine(Sun(x1), __)) =⇒ Drink(Barbara(x3), Lemonade(x4))
Hard(Shine(Sun(x1), __)) =⇒ Slowly(Drink(Barbara(x3), Lemonade(x4)))
Hard(Shine(Sun(x1), __)) =⇒ Drink(Barbara(x3), Fresh(Lemonade(x4)))

As said before, the antecedent (when it exists) of all the generalizations remains unchanged, to preserve the quality of the triggering condition, while the consequent shape ranges over all possible variations of its modifiers, which are 2^n with n the number of modifiers. Here the adverb Hard, being a common part of all the antecedents, is always Applied. Although in this case the number of generalizations is equal to 4, in general it might be considerably higher: it has been observed, after an analysis of several text corpora from the Stanford Question Answering Dataset [29], that the average number of modifiers in a single non-implicative utterance is equal to 6. In such cases the number of generalizations would be equal to 64, and greater numbers of modifiers would make the parsing less tractable, considering also the argument analysis needed for possible substitutions. To limit such a phenomenon, depending on the domain, CASPAR offers the chance to cap the number of generalizations with a specific parameter which modifies the policies of selective inclusion/exclusion of the modifier categories (adjectives, adverbs or prepositions). In such a scenario, of course, the more the combinatorial possibilities, the more clauses in the Clauses KB.
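The 2^n enumeration of consequent variants can be sketched as follows; string surgery stands in for the real tuple-level rewriting, and the helper names are ours, not pycaspar's API:

from itertools import combinations

ANTECEDENT = "Hard(Shine(Sun(x1), __))"
CORE = "Drink(Barbara(x3), Lemonade(x4))"
MODIFIERS = ("Slowly", "Fresh")  # the adverb wraps the verb, the adjective the object

def apply_mods(core, active):
    out = core
    if "Fresh" in active:
        out = out.replace("Lemonade(x4)", "Fresh(Lemonade(x4))")
    if "Slowly" in active:
        out = "Slowly(%s)" % out
    return out

# Every subset of modifiers yields one generalization: 2^2 = 4 clauses here
for r in range(len(MODIFIERS) + 1):
    for subset in combinations(MODIFIERS, r):
        print(ANTECEDENT, "=⇒", apply_mods(CORE, subset))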
It should be clear to the reader that this approach sacrifices space for a lighter reasoning, but we rely on three distinct points in favor of our choice:

1. An efficient indexing policy of the Clauses KB, for a fast retrieval of any clause.
2. The usage of the Phidias class Sensor for every clause assertion, which works asynchronously with respect to the main production rule system; this makes the agent immediately available after every interrogation, without any latency, while the additional clauses are asserted in the background.
3. We aim to keep the Clauses KB as small as possible, in order to limit the combinatorial chances.

In this paper we assume the assignment rules are properly chosen among the most likely ones, i.e., those which can bring the query closer to a proper candidate. As future work, a reasonable balance between two distinct Clauses KBs working on different levels might be a good solution: in the lower level (long-term memory) only clauses pertinent to the query would be searched, and then put in the higher one (short-term memory) to attempt a successful reasoning. Similar approaches have been used with interesting outcomes in some of the widespread cognitive architectures [8].

As a result evaluation, we consider a slightly rephrased version of the Colonel West KB treated in [28], showing how CASPAR is able to carry out a successful reasoning for a question requiring a non-trivial deduction. Although this architecture is designed to work as a vocal assistant, one can likewise verify the reasoning by manually asserting the same belief STT that the Sensor Instance would assert, as shown in Listing 2 in the Appendix. There, after each assertion (lines 1, 8, 13, 18, 30) the newly asserted clauses are shown, and it appears clearly how the agent expands the Clauses KB considering generalizations and argument substitutions. After the query is given (line 45), it is shown how the nominal Backward-Chaining algorithm is not enough to achieve a successful reasoning, while the Nested Reasoning succeeds.

In Section 3.3 we have also shown how a direct command or routine can be subordinated to a clause. Although in the example (see lines 12-13 of Listing 1 in the Appendix) the production rule contains the representation of An inhabitant is at home, even a clause involving the Nested Reasoning might trigger such a rule; for instance, a simple toy scenario could include a facial recognizer among the domotic devices, which obtains information about known/unknown faces when someone is detected in the environment. Such a recognition could generate a clause representing (for instance) Robert is at home, which, combined with another clause representing Robert is an inhabitant, will produce the representation of An inhabitant is at home; the latter will trigger the production rule (related to a direct command or routine) that turns off the alarm in the garage. This will not happen if a thief or a domestic animal is detected; thus it provides a valid example of how the Beliefs KB and the Clauses KB interact with each other in a non-trivial process of deduction.

5. Conclusions and Future Work
In this paper, we have presented the design of a cognitive architecture called CASPAR, able to implement agents capable of both reactive and cognitive reasoning. Moreover, we want to mark a way towards a comprehensive strategy for making deductions on Knowledge Bases whose content is parsed directly from natural language.
This architecture works by using a Knowledge Base divided into two distinct parts (Beliefs KB and Clauses KB), which can also interact with each other in decision-making processes. In particular, as the Clauses KB grows, the architecture's cognitive features improve, due to an implicit and native capability of inferring combinatorial rules from its own Knowledge Base. Thanks to the Nested Reasoning and the Clause Conceptual Generalizations, CASPAR is able to transcend the limits that the nested semantic notation, as highly descriptive as it is compact, imposes on the standard Backward-Chaining algorithm. Furthermore, agents based on this architecture are able to parse complex direct IoT commands and routines, letting users customize their own Smart Environment Interface and Sensors with ease, with whatever Speech-to-Text engine. As future work, we want to test CASPAR's capabilities with languages other than English and evaluate other integrations, like Abductive Reasoning and Argumentation. Chatbot applications can also take advantage of this architecture's features. Finally, we want to exploit the multi-agent features of Phidias by implementing standardized communication protocols between agents, and to exploit other ontologies as well.

References
[1] V. Këpuska, G. Bohouta, Next-generation of virtual personal assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home), in: 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), 2018, pp. 99-103.
[2] H. Jeon, H. R. Oh, I. Hwang, J. Kim, An Intelligent Dialogue Agent for the IoT Home, in: AAAI Workshops, 2016. URL: https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12596.
[3] E. V. Polyakov, M. S. Mazhanov, A. Y. Rolich, L. S. Voskov, M. V. Kachalova, S. V. Polyakov, Investigation and development of the intelligent voice assistant for the Internet of Things using machine learning, in: 2018 Moscow Workshop on Electronic and Networking Technologies (MWENT), 2018, pp. 1-5.
[4] M. Mehrabani, S. Bangalore, B. Stern, Personalized speech recognition for Internet of Things, in: 2015 IEEE 2nd World Forum on Internet of Things (WF-IoT), 2015, pp. 369-374.
[5] R. Kar, R. Haldar, Applying Chatbots to the Internet of Things: Opportunities and Architectural Elements, International Journal of Advanced Computer Science and Applications 7 (2016). doi:10.14569/IJACSA.2016.071119.
[6] C. J. Baby, F. A. Khan, J. N. Swathi, Home automation using IoT and a chatbot using natural language processing, in: 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), 2017, pp. 1-6.
[7] I. Kotseruba, J. K. Tsotsos, 40 years of cognitive architectures: core cognitive abilities and practical applications, Artificial Intelligence Review 53 (2020) 17-94. doi:10.1007/s10462-018-9646-y.
[8] D. F. Lucentini, R. R. Gudwin, A comparison among cognitive architectures: A theoretical analysis, in: 2015 Annual International Conference on Biologically Inspired Cognitive Architectures, 2015.
[9] G. Tononi, Consciousness as integrated information: A provisional manifesto, The Biological Bulletin 215(3) (2008) 216-242.
[10] S. Epstein, R. Passonneau, J. Gordon, T. Ligorio, The role of knowledge and certainty in understanding for dialogue, in: AAAI Fall Symposium Series, 2011. URL: https://www.aaai.org/ocs/index.php/FSS/FSS11/paper/view/4179.
[11] V. Këpuska, G. Bohouta, Comparing speech recognition systems (Microsoft API, Google API and CMU Sphinx), International Journal of Engineering Research and Application 7(3), Part 2 (March 2017) 20-24.
[12] M. Dehghani, E. Tomai, K. Forbus, M. Klenk, An integrated reasoning approach to moral decision-making, in: Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3, AAAI'08, AAAI Press, 2008, pp. 1280-1286.
[13] M. Scheutz, P. Schermerhorn, J. Kramer, D. Anderson, First Steps toward Natural Human-like HRI, Auton. Robots 22 (2007) 411-423. doi:10.1007/s10514-006-9018-3.
[14] D. L. Schacter, Implicit memory: history and current status, Journal of Experimental Psychology: Learning, Memory, and Cognition 13 (1987) 501-518.
[15] D. Davidson, The logical form of action sentences, in: The Logic of Decision and Action, University of Pittsburgh Press, 1967, pp. 81-95.
[16] Linguistic Data Consortium, Treebank-3, 2017. URL: https://catalog.ldc.upenn.edu/LDC99T42.
[17] X. Huang, L. Deng, An Overview of Modern Speech Recognition, Microsoft Corporation, 2009, pp. 339-344.
[18] R. Mehla, Mamta, Automatic speech recognition: A survey, International Journal of Advanced Research in Computer Science and Electronics Engineering 3(1) (January 2014) 20-24.
[19] S. Benkerzaz, Y. Elmir, A. Dennai, A study on automatic speech recognition, Journal of Information Technology Review 10(3) (August 2019).
[20] J. D. Choi, J. Tetreault, A. Stent, It depends: Dependency parser comparison using a web-based evaluation tool, in: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015, pp. 387-396.
[21] S. Anthony, J. Patrick, Dependency based logical form transformations, in: SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, 2004.
[22] ClearNLP, ClearNLP tagset, 2015. URL: https://github.com/clir/clearnlp-guidelines.
[23] B. H. Partee, Lexical Semantics and Compositionality, volume 1, Lila R. Gleitman and Mark Liberman editors, 1995, pp. 311-360.
[24] F. D'Urso, C. F. Longo, C. Santoro, Programming intelligent IoT systems with a Python-based declarative tool, in: The Workshops of the 18th International Conference of the Italian Association for Artificial Intelligence, 2019.
[25] C. F. Longo, C. Santoro, F. F. Santoro, Meaning Extraction in a Domotic Assistant Agent Interacting by means of Natural Language, in: 28th IEEE International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises, IEEE, 2019.
[26] L. Fichera, F. Messina, G. Pappalardo, C. Santoro, A Python framework for programming autonomous robots using a declarative approach, Sci. Comput. Program. 139 (2017) 36-55. doi:10.1016/j.scico.2017.01.003.
[27] G. A. Miller, WordNet: A lexical database for English, Communications of the ACM 38(11) (1995) 39-41.
[28] S. J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Pearson, 2010.
[29] The Stanford Question Answering Dataset (SQuAD 2.0), 2018. URL: https://rajpurkar.github.io/SQuAD-explorer/.
Appendix
In this appendix, a simple instance of the Smart Environment Interface is provided (Listing 1), together with an example of how the Clauses Knowledge Base changes after assertions (Listing 2).

 1  # Routine conditionals management
 2  +SENSOR(V, X, Y) >> [check_conds()]
 3  check_conds() / (SENSOR(V, X, Y) & COND(I, V, X, Y) & ROUTINE(I, K, J, L, T)) >> [-COND(I, V, X, Y), +START_ROUTINE(I), check_conds()]
 4  check_conds() / SENSOR(V, X, Y) >> [-SENSOR(V, X, Y)]
 5
 6  # Routines execution
 7  +START_ROUTINE(I) / (COND(I, V, X, Y) & ROUTINE(I, K, J, L, T)) >> [show_line("routine not ready!")]
 8  +START_ROUTINE(I) / ROUTINE(I, K, J, L, T) >> [-ROUTINE(I, K, J, L, T), +INTENT(K, J, L, T), +START_ROUTINE(I)]
 9
10  # turn off
11  +INTENT(X, "light", "kitchen", T) / lemma_in_syn(X, "change_state.v.01") >> [exec_cmd("change_state.v.01", "light", "kitchen", T)]
12  +INTENT(X, "alarm", "garage", T) / (lemma_in_syn(X, "change_state.v.01") &
13      eval_cls("At_IN(Be_VBP(Inhabitant_NN(x1), __), Home_NN(x2))")) >> [exec("change_state.v.01", "alarm", "garage", T)]
14
15  # any other commands
16  +INTENT(V, X, L, T) >> [show_line("Result: failed to execute the command: ", V)]

Listing 1: A simple instance of the Smart Environment Interface

 1  > +STT("Nono is an hostile nation")
 2
 3  Be(Nono(x1), Nation(x2))
 4  Be(Nono(x1), Hostile(Nation(x2)))
 5  Nono(x) ==> Nation(x)
 6  Nono(x) ==> Hostile(Nation(x))
 7
 8  > +STT("Colonel West is American")
 9
10  Be(Colonel_West(x1), American(x2))
11  Colonel_West(x) ==> American(x)
12
13  > +STT("missiles are weapons")
14
15  Be(Missile(x1), Weapon(x2))
16  Missile(x) ==> Weapon(x)
17
18  > +STT("Colonel West sells missiles to Nono")
19
20  Sell(Colonel_West(x1), Missile(x2)) ==> Sell(American(v_0), Missile(x4))
21  Sell(Colonel_West(x1), Missile(x2)) ==> Sell(American(x3), Weapon(v_1))
22  Sell(Colonel_West(x1), Missile(x2)) ==> Sell(Colonel_West(x1), Weapon(v_2))
23  Sell(Colonel_West(x1), Missile(x2))
24  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(Colonel_West(x1), Missile(x2)), Nation(v_4))
25  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_5), Missile(v_6)), Nation(v_4))
26  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_7), Weapon(v_8)), Nation(v_4))
27  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(Colonel_West(v_9), Weapon(v_10)), Nation(v_4))
28  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(Colonel_West(x1), Missile(x2)), Hostile(Nation(v_11)))
29  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_12), Missile(v_13)), Hostile(Nation(v_11)))
30  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_14), Weapon(v_15)), Hostile(Nation(v_11)))
31  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(Colonel_West(v_16), Weapon(v_17)), Hostile(Nation(v_11)))
32  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_18), Missile(v_19)), Nono(x3))
33  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(American(v_22), Weapon(v_23)), Nono(x3))
34  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3)) ==> To(Sell(Colonel_West(v_26), Weapon(v_27)), Nono(x3))
35  To(Sell(Colonel_West(x1), Missile(x2)), Nono(x3))
36
37  > +STT("When an American sells weapons to a hostile nation, that American is a criminal")
38
39  To(Sell(American(x1), Weapon(x2)), Hostile(Nation(x3))) ==> Be(American(x4), Criminal(x5))
40
41  > +STT("reason")
42
43  Waiting for query...
44
45  > +STT("Colonel West is a criminal")
46
47  Reasoning...............
48
49  Query: Be_VBZ(Colonel_West(x1), Criminal(x2))
50
51  ---- NOMINAL REASONING ----
52
53  Result: False
54
55  ---- NESTED REASONING ----
56
57  Result: {v_211: v_121, v_212: x2, v_272: v_208, v_273: v_209, v_274: v_210, v_358: v_269, v_359: v_270, v_360: v_271}

Listing 2: CASPAR Clauses Knowledge Base changes and reasoning, after assertions