-

K. 2010. A conversational agent frame- work using semantic analysis. International Journal of In

2010

Our architecture has a Knowledge Engine that models the content semantics, a Conversation Engine that models the pragmatic semantics, and a Chat Interface that performs preprocessing and post-processing tasks (Figure 1).

The Chat Interface contains modules for receiving user input, performing stemming, detecting speech acts, detecting topic, and interfacing with the Knowledge Engine and Conversation Engine (Chakrabarti 2014).

The Knowledge Engine identifies the specific speech act for the utterance and also the specific topic being discussed. A goal-fulfillment map (O’Shea, Bandar, and Crockett 2010) specifies the content semantics for the conversation. The specific goal-fulfillment map is selected from a double-key hash table, where the keys are the topic and the speech act (Chakrabarti 2014).

The Conversation Engine models four different types of conversations, procedural conversations, informational conversations, troubleshooting conversations, and disputeresolution conversations using four different probabilistic finite state automata (Chakrabarti 2014). Figure 4. shows one such finite state machine for a troubleshooting conversation. Distinct finite state machines have been defined for other types of conversations as well.

The conversation planner maintains a workspace of four types of conversations, and increases or decreases a heuristic score for each type depending on how the conversation unfolds. A successful conversation is one in which only one type of conversation remains in the workspace, and the probabilistic finite state automaton associated with that conversation reaches a defined accepting state. A failed conversation is one in which the conversation reaches a defined unescapable dissatisfaction state for one of the probabilistic finite state automation, or all four probabilistic finite state automata associated with the four conversation types are dropped from the workspace (Chakrabarti 2014). Figure 5. shows an overview of the conversation planner.

(Chakrabarti 2014) shows in detail how the Knowledge Engine in conjunction with the Conversation Engine engineers artificial conversations. The architecture models how a human would generate a conversation and incorporates well defined ideas from conversation theory, speech act theory, and the theory of pragmatics.

Corpus and Parameter Learning

We used a corpus of chat transcripts between a human customer and a human customer service agent working for an online electronic trading portal. The corpus consisted of 2,886 distinct conversations. Each conversation was in the form of an Excel file and was clearly tagged by a unique conversation identifier. For example, in each conversation, the utterances were tagged by who was delivering it, either the customer or the customer service agent. An utterance is everything that is said by either the customer or the representative in a single turn. It consists of one for more sentences. We assume that each utterance belongs to a single context.

A series of successive utterance pairs on the same context constitutes a conversation. The shortest conversation had 5 distinct utterances. The longest conversation had 82 distinct utterances. The median was 26 utterances and the average was around 22 utterances. The utterances were mostly interleaved, i.e., alternating between the customer and the representative. Most of the conversations were related to a single context. The conversations that were not interleaved and related to more than one context were not analyzed.

We used a bag-of-words based latent-semantic algorithm to tag each utterance in each conversation in the corpus with a speech act. We also used a bag-of-words based latentsemantic algorithm to tag each conversation in the corpus with one of the topics (Chakrabarti 2014). The transition probabilities for the four finite state automaton corresponding to the 4 types of conversations were also learned from the corpus.

Generation of Artificial Conversations

This section shows in detail how an artificial conversation is generated in a step-by-step manner. 1. The conversation starts with a human making a comment.

Customer : I would like to open a new account for day trading.

What are my options? This message is entered from the standard terminal. The Utterance Bucket directly collects the text in the form of a string. A standard spellchecker and grammar checker autocorrects the spelling and grammatical errors in the sentence if any 2. The correct sentence, free of spelling and grammatical error, is sent to the Stemmer. Using Porter’s Stemming algorithm, the following stems are obtained, ”account”, ”day trade”, ”open”, and ”options”. 3. The entire stemmed sentence is then passed on simultaneously to the Speech Act Detector, the Sentiment detector, and the Topic Detector. The following events then take place. * The Speech Act Detector uses Latent Semantic Analysis to determine that the type of speech act is ”Expressive”, since the bag of words included ”would” and ”like”. * The Sentiment Detector detects that the sentiment is neutral, since none of the words from the positive or negative bag of words is encountered. * The Topic Detector determines using Latent Semantic analysis that the topic is ”new account” using bag of words ”new”, ”account”, and ”open”. 4. The output of the Speech Act Detector, the Sentiment Detector, and the Topic Detector is then sent to the interface. The Interface combines these into an array list, and sends the array list to the Conversation Engine and the Knowledge Engine simultaneously. 5. In the Knowledge Engine, the following steps take place. * The Interface of the knowledge engine receives the array list and sends it to the Speech Act Identifier. This module selects the correct speech act from the list as ”expressive”. * The interface also sends the bag of words to the topic hash table. The hash table retrieves the topic as ”new account”. The appropriate context map is then pulled out. This context map lists the steps for the encoded knowledge for opening a new account in the form of a goal-fulfillment map. The appropriate goal-fulfillment map, shown in Figure 5.17, is then put in to the workspace and sent to the interface. * A goal-fulfillment algorithm is initiated. A counter is initiated to keep track of the progression of goals in the map. 6. In the Conversation Engine, the following steps take place. * In the Probabilistic finite State Automata, initially all four possible solutions are maintained. This is because initially the probabilities of each conversation type will be nearly equal. A counter is initialized to maintain the current state of the conversation in each solution. * The Conversation Planner will calculate the probabilities of transition from one state to another depending upon the Speech Act being uttered. These transitions are learned from the corpus and are stored in a lookup table. The Conversation Planner is responsible for advancing the counter indicating the current state of the conversation. 7. The information is sent back to the Chat Interface.

The Utterance Bucket corrects spelling (unlikely) and grammatical errors, and then outputs the response of the 8. This process is repeated until the end of the conversation is indicated by the Conversation Planner counter being in an accepting state.

Conversation Creation

The next step is to actually generate the artificial conversation using the chatter bot architecture. The conversations are generated by a person, by interacting with the chatter bot architecture via a standard terminal. These are the steps to generate a conversation. 1. Play the role of the customer of the online electronic trading website. Pick out an issue from the list in 6.1.2. ”Know” the responses to all the customer-side details. For example, know that the account can have two different modes and two different trading configurations. 2. Begin a conversation with the chatter bot by typing on the standard terminal. 3. The bot will then initiate a question. It will be displayed on to the terminal window. This will almost always be ”small talk” at the beginning of the conversation. Answer the questions the bot asks by typing back into the terminal window. 4. The conversation will be lead by the bot, i.e., - the bot will either ask the question to which the customer will respond (when did you put in the buy order?), or - the bot will instruct the customer to perform some action (change the configuration of the account) to which the customer will answer affirmatively that he / she has completed the action, or answer negatively that he / she is unable to perform the action with a qualifier (I am unable to access the reset password form. I do not have my customer relationship number.) or - The bot will ask a question that will require a Yes or

No answer. 5. The responses of the customer has to be an exact match with the expected answer in the goal fulfillment map, irrespective of the response that the customer choses. For example, in response to a query from the bot: ”Do you remember what kind of orders you placed?” - The customer can either answer negatively ”No, I do not remember” or - The customer can answer ”Yes, they were buy orders” or ”Buy orders” or ”Yes, buy orders” - The customer can answer ”Yes, they were sell orders” or ”Sells orders” or ”Yes, sell orders” But the customer cannot answer ”Very unlikely they were buy orders, but I am not really sure”. This is because sentence similarity hasn’t been implemented in this architecture. Sentence similarity is the area of research that reduces a range of semantically similar sentences into a root sentence (O’Shea et al. 2004; O’Shea, Bandar, and Crockett 2009). Hence for this dissertation, the responses need to have the exact words with only a slight change in grammar. 6. The transcript of the conversation is written to a file, and is tagged with the customer utterance and bot utterance. These transcripts can then be analyzed.

Results, Conclusions, and Future Directions

We generated 48 artificial conversations using this technique. Out of these, 42 conversations reached a conclusion state, and 6 conversations failed. Thus, we had a success rate of 87.5%. The transcripts of all 48 conversations are available at www.cs.unm.edu/˜cc/ artificial_conversations/transcripts/.

We demonstrate a modular, robust, and scalable architecture for chatter bots. The specific concepts of pragmatics, speech acts, and dialogue acts are well known in the field of conversation theory. However, this research is the first example of computationally modeling these specific concepts to realize pragmatic semantics for chatter bots. Similarly, specific concepts like goal-fulfillment maps have been explored previously in the knowledge representation literature. But this work is the first example of using goal-fulfillment maps for modeling content semantics for chatter bots in the form of a series of sub-contexts. In addition, this work is the first example of combining pragmatic semantics and content semantics to generate artificial conversations.

There are several exciting directions in which this work might be extended. Incorporating richer knowledge representation and retrieval techniques, such as ontologies, might make the architecture work with even less situation specific contextual conversations. We considered only four types of conversations, i.e., Procedural, Informational, Troubleshooting, and Dispute Resolution. Other types of conversations can be defined and the modeling and analysis can be extended to these types.The conversations were modeled using stochastic finite state automata, which worked well in narrow situational contexts.

References

[Bobrow et al. 1977] Bobrow, D. G.; Kaplan, R. M.; Kay, M.; Norman, D. A.; Thompson, H.; and Winograd, T. 1977. Gus: A frame-driven dia—og system. Artificial Intelligence 8:155–173. [Chakrabarti and Luger 2012] Chakrabarti, C., and Luger, G. 2012. A semantic architecture for artificial conversations. In The 13th International Symposium on Advanced Intelligent Systems. Kobe, Japan: IEEE Press. [Chakrabarti and Luger 2013] Chakrabarti, C., and Luger, G. 2013. A framework for simulating and evaluating artificial chatter bot conversations. In The 26th International Florida Artificial Intelligence Research Society Conference. St. Pete Beach, FL: AAAI Press. [Chakrabarti 2014] Chakrabarti, C. 2014. Artificial Conversations for Chatter Bots Using Knowledge Representation, Learning, and Pragmatics. Ph.D. Dissertation, University of New Mexico, Albuquerque, NM. [Filisko and Seneff 2003] Filisko, E., and Seneff, S. 2003. A context resolution server for the galaxy conversational systems. In Proc. Eurospeech. [Gasic et al. 2013] Gasic, M.; Breslin, C.; Henderson, M.; Kim, D.; Szummer, M.; Thomson, B.; Tsiakoulis, P.; and Young, S. 2013. Pomdp-based dialogue manager adaptation to extended domains. In SigDial Metz France. [Ginzburg 2008] Ginzburg, J. 2008. Semantics for Conversation. King’s College, London: CSLI Publications. [Henderson, Thomson, and Young 2013] Henderson, M.; Thomson, B.; and Young, S. 2013. Deep neural network approach for the dialog state tracking challenge. In SigDial Metz FranceMetz France. [Levin et al. 2000] Levin, E.; Narayanan, S.; Pieraccini, R.; Biatov, K.; Bocchieri, E.; Fabbrizio, G. D.; Eckert, W.; Lee, S.; Pokrovsky, A.; Rahim, M.; Ruscitti, P.; and Walker, M. 2000. The att-darpa communicator mixed-initiative spoken dialog system. In ICSLP. [Metallinou et al. 2013] Metallinou, A.; Bohus, D.; ; and Williams, J. D. 2013. Discriminative state tracking for spoken dialog systems. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria. [O’Shea, Bandar, and Crockett 2009] O’Shea, K.; Bandar, Z.; and Crockett, K. 2009. A semantic- based conversational agent framework. In The 4th International Conference for Internet Technology and Secured Transactions (ICITST2009), Technical Co- Sponsored by IEEE UK?RI Communications Chapter, 92–99. [O’Shea et al. 2004] O’Shea, K.; Bandar, Z.; Crockett, K.; and Mclean, D. 2004. A comparative study of two short text semantic similarity measures. Lecture Notes on Artificial Intelligence 4953:172. [Rieser and Lemon 2013] Rieser, V., and Lemon, O. 2013. Reinforcement Learning for Adaptive Dialogue Systems: A Data-driven Methodology for Dialogue Management and Natural Language Generation. Springer. [Seneff et al. 1998] Seneff, S.; Hurley, E.; Lau, R.; Pao, C.; Schmid, P.; and Zue, V. 1998. Galaxy-ii: A reference architecture for conversational system development. In Proc. ICSLP, Sydney, Australia.

[Polifroni and Seneff 2000 ] Polifroni, J. , and Seneff , S. 2000 . Galaxy-ii as an architecture for spoken dialogue evaluation . In LREC.