Linguistic Profiling and Behavioral Drift in Chat Bots

Nawaf Ali, Derek Schaeffer, Roman V. Yampolskiy (roman.yampolskiy@louisville.edu)

Computer Engineering and Computer Science Department, J. B. Speed School of Engineering, University of Louisville, Louisville, KY, USA


When trying to identify the author of a book, a paper, or a letter, the objective is to detect a style that distinguishes one author from another. With recent developments in artificial intelligence, chat bots sometimes play the role of text authors. This study investigates how chat bot linguistic style changes over time and how that change affects authorship attribution. The results show that some chat bots exhibited behavioral drift in their style, and they imply that any non-zero change in linguistic style makes our chat bot identification process more difficult.

I. Introduction

Biometric identification is a way to discover or verify a person's identity by using physiological and behavioral traits (Jain, 2000). To serve as an identifier, a biometric should have the following properties: (a) universality, meaning that the characteristic applies to everybody; (b) uniqueness, meaning that the characteristic is unique to each individual being studied; (c) permanence, meaning that the characteristic does not change over time in a way that obscures the identity of a person; and (d) collectability, meaning that the characteristic can be measured (Jain, Ross & Nandakumar, 2011). Biometric identification technologies are not limited to fingerprints. Behavioral traits associated with each human provide a way to identify the person by a biometric profile. Behavioral biometrics have an advantage over traditional biometrics in that they can be collected without the knowledge of the user under investigation (Yampolskiy & Govindaraju, 2008). Characteristics pertaining to language, composition, and writing style, such as particular syntactic and structural layout traits, vocabulary usage and richness, unusual language usage, and stylistic traits, remain relatively constant. Identifying and learning these characteristics is the primary focus of authorship authentication (Orebaugh, 2006).

Authorship identification is a research field concerned with finding traits that can identify the original author of a document.

Two main subfields of authorship identification are: (a) Authorship recognition, when there is more than one author claiming a document, and the task is to identify the correct author based on the study of style and other author-specific features.

(b) Authorship verification, where the task is to verify that the claimed author of a document is the correct author, based on that author's profile and the study of the document (Ali, Hindi & Yampolskiy, 2011). The twelve Federalist papers claimed by both Alexander Hamilton and James Madison are an example of authorship recognition (Holmes & Forsyth, 1995). Detecting plagiarism is a good example of the second type. Authorship verification is mostly used in forensic investigation.

When examining people, a major challenge is that the writing style of the writer might evolve and develop with time, a concept known as behavioral drift (Malyutov, 2005). Chat bots, which are built algorithmically, have never been analyzed from this perspective. A study on identifying chat bots using the Java Graphical Authorship Attribution Program (JGAAP) has shown that it is possible to identify chat bots by analyzing their chat logs for linguistic features (Ali, Hindi & Yampolskiy, 2011).

A. Chat bots

Chat bots are computer programs mainly used in applications such as online help, e-commerce, customer services, call centers, and internet gaming (Webopedia, 2011).

Chat bots are typically perceived as engaging software entities with which humans may communicate, and which attempt to fool the human into thinking that he or she is talking to another human. Some chat bots use Natural Language Processing Systems (NLPS) when replying to a statement, while the majority of other bots scan for keywords within the input and pull the reply with the most matching keywords (Wikipedia, 2011).
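The keyword-matching strategy can be illustrated with a minimal sketch. The reply table, the keywords, and the function names below are hypothetical and are not taken from any of the chat bots studied here; the sketch only shows the general idea of returning the canned reply whose keywords overlap most with the input.

```python
import re

# Hypothetical reply table: each canned reply is tagged with trigger keywords.
REPLIES = [
    ({"hello", "hi", "hey"}, "Hello! How are you today?"),
    ({"name", "who", "you"}, "I am a simple keyword bot."),
    ({"weather", "rain", "sunny"}, "I hear the weather is nice."),
]
DEFAULT_REPLY = "Tell me more."


def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z']+", text.lower()))


def pick_reply(user_input):
    """Return the reply whose keyword set overlaps most with the input."""
    words = tokenize(user_input)
    best_reply, best_score = DEFAULT_REPLY, 0
    for keywords, reply in REPLIES:
        score = len(keywords & words)  # number of matching keywords
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply
```

Because the replies come from a fixed table, a bot of this kind tends to reuse the same vocabulary, which is one plausible reason its text can be attributed to it.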

B. Motivations

The ongoing threats from criminals have migrated from physical threats and violence to another dimension, the cyber world. Criminals try to steal others' information and identity by any means. Researchers are responding with more work on preventing criminal activities, whether identity theft or even terrorist threats.

II. Application and Data Collection

Data was downloaded from the Loebner prize website (Loebner, 2012). In the Loebner contest, a group of human judges from different disciplines and ages talk with the chat bots, and each chat bot earns points depending on the quality of the conversation it produces. An earlier study of chat bot authorship, using data collected in 2011 (Ali, Hindi & Yampolskiy, 2011), demonstrated the feasibility of using authorship identification techniques on chat bots. The data in the current study was collected over a period of years. Our data pertained only to the chat bots studied in (Ali, Hindi & Yampolskiy, 2011), which is why this study does not cover every year of the Loebner contest, which started in 1996. Only the years in which the chat bots under study took part were used in this research.

III. Data Preparation

The collected data had to be preprocessed by deleting unnecessary labels such as the chat bot name and the date and time of the conversation (Fig. 1). A Perl script was used to clean the files and split each chat into two text files, one for the chat bot under study and the other for the human judge. The judge part was ignored, and only the chat bot text was analyzed.
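The cleaning step can be sketched as follows. This is not the Perl script used in the study; it is a Python illustration that assumes a hypothetical line layout of the form "<date> <time> <speaker>: <utterance>". The actual Loebner log layout differs between years, so the pattern would have to be adapted.

```python
import re

# Assumed (hypothetical) line layout: "2011-10-19 10:01:05 Judge: Hello there"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (?P<speaker>[^:]+): (?P<text>.*)$"
)


def split_transcript(lines, bot_name):
    """Strip date, time and speaker labels and separate bot and judge turns.

    Returns (bot_utterances, judge_utterances); lines that do not match
    the assumed layout are skipped.
    """
    bot, judge = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        is_bot = match.group("speaker").strip().lower() == bot_name.lower()
        (bot if is_bot else judge).append(match.group("text").strip())
    return bot, judge
```

Only the first returned list would be kept for analysis, mirroring the decision to ignore the judge's side of each conversation.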

IV. Chat Bots Used

Eleven chat bots were used in the initial experiments: Alice (ALICE, 2011), CleverBot (CleverBot, 2011), Hal (HAL, 2011), Jeeney (Jeeney, 2011), SkyNet (SkyNet, 2011), TalkBot (TalkBot, 2011), Alan (Alan, 2011), MyBot (MyBot, 2011), Jabberwock (Jabberwock, 2011), Jabberwacky (Jabberwacky, 2011), and Suzette (Suzette, 2011). These formed our main baseline, against which we compare the chat bots under study: Alice, Jabberwacky, and Jabberwock.

V. Experiments

The experiments were conducted using RapidMiner (RapidMiner, 2011). A model was built for authorship identification that accepts the training text and creates a word list and a Support Vector Machine (SVM) model (Fig. 2). The word list and model are then applied to the test text, which in our case is data from the Loebner prize site (Loebner, 2012). In Fig. 3, the saved word list and model are the input for the testing stage, and the output gives the percentage prediction of the tested files. The data was tested using two different saved models: one trained with the complete set of eleven chat bots, and one trained using only the three chat bots under study.
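The role of the saved word list can be sketched in a few lines. This is an illustration under stated assumptions, not the RapidMiner process itself: the tokenizer, the relative term-frequency weighting, and the function names are our own choices, and the SVM training step is left out. The point is that the test text is mapped onto the vocabulary fixed at training time, so words never seen in training are ignored.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_word_list(training_texts):
    """Return the sorted vocabulary seen in the training texts."""
    vocab = set()
    for text in training_texts:
        vocab.update(tokenize(text))
    return sorted(vocab)


def vectorize(text, word_list):
    """Map a text to relative term frequencies over a fixed word list.

    Words that are not in the word list are dropped, as happens when a
    saved word list is reused at the testing stage.
    """
    counts = Counter(tokenize(text))
    total = sum(counts[w] for w in word_list)
    if total == 0:
        return [0.0] * len(word_list)
    return [counts[w] / total for w in word_list]
```

A classifier such as an SVM would then be trained on the vectors of the training texts and applied to the vectors of the test texts.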

The model outputs confidence values, which reflect how confident the model is that a chat bot is identified correctly. The chat bot with the highest confidence value (printed in boldface in all tables) is the predicted bot according to the model. Table 1 shows the confidence values for Alice's text files in different years when eleven chat bots are used for training. Table 2 shows the confidence level of Alice's files when only the three chat bots under study are used for training. Fig. 4 shows the results of testing the three chat bots over different years when training our model using all eleven chat bots.
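How a prediction and an identification percentage follow from the confidence values can be sketched as below. The function names are hypothetical, and the sketch assumes that the identification percentage is the share of a bot's test files whose highest-confidence label is that bot.

```python
def predicted_bot(confidences):
    """Return the label with the highest confidence value."""
    return max(confidences, key=confidences.get)


def identification_percentage(file_confidences, true_bot):
    """Percentage of files whose predicted label equals the true bot.

    file_confidences holds one dict per tested file, mapping each
    candidate bot name to the model's confidence for that file.
    """
    if not file_confidences:
        return 0.0
    hits = sum(1 for conf in file_confidences if predicted_bot(conf) == true_bot)
    return 100.0 * hits / len(file_confidences)
```

For example, with two made-up files (illustrative values, not taken from the tables), one attributed to Alice and one to Jabberwacky, the identification percentage for Alice is 50%.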

The results in Fig. 5 come from the experiments that use a training set based on the three chat bots under study: Alice, Jabberwacky, and Jabberwock. Jabberwock did not take part in the 2005 contest. Table 3 shows the confidence level of Jabberwacky's files when tested with the complete set of eleven chat bots. Table 4 shows the confidence level of Jabberwock's files when all the chat bots are used for training.

VI. Conclusions and Future Work

The initial experiments conducted on the collected data did show a variation between chat bots, which is expected. It is not expected that all chat bots will act the same way, since they have different creators and different algorithms.

Some chat bots are more intelligent than others; the Loebner contest aims to contrast such differences. The Alice bot showed some consistency over the years under study, but in 2005 Alice's style was not as recognizable as in other years. Jabberwacky performed well for all years when the model was trained with just three bots. When the training set contained all eleven chat bots, Jabberwacky was not identified in 2001 and gave a 40% correct prediction in 2005. Jabberwock, the third chat bot under study here, was the least consistent of all the bots: it gave 0% correct prediction in 2001 and 2004, and 91% in 2011, which may indicate that Jabberwock's vocabulary improved in a way that gave it a style of its own.

With the model trained on only the three chat bots under study, Jabberwacky was identified 100% correctly over all years. Alice did well for all years except 2005, and Jabberwock was not identified at all in 2001 and 2004.

With these initial experiments, we can state that some chat bots do change their style, most probably depending on the intelligent algorithms used in initializing conversations. Other chat bots do have a steady style and do not change over time.

More data is required to get reliable results; we only managed to obtain data from the Loebner prize competition, which in some cases was just one 4KB text file. With sufficient data, results should be more representative and accurate.

Additional research on these chat bots will be conducted, and work on finding specific features to identify the chat bots will continue. This is a burgeoning research area, and much work still needs to be done.

Fig. 1. Sample conversation between a chat bot and a judge.

Fig. 2. Training model using RapidMiner.

Fig. 3. Testing stage using RapidMiner.

Fig. 4. Identification percentage over different years using all eleven chat bots for training.

Table 1. Confidence level of Alice's files when tested with all eleven chat bots used in training.

[Operator labels recovered from the RapidMiner process figures: Process Document, Normalize, Validation, Store Word list, Store Model, Get Word list, Process Document, Normalize, Get Model, Apply Model.]

Table 2. Confidence level of Alice's files when tested with only three chat bots used in training.

Table 3. Confidence level of Jabberwacky's files when tested with all eleven chat bots used in training.

Table 4. Confidence level of Jabberwock's files when tested with all eleven chat bots used in training.

References

Alan (2011). AI Research. Retrieved June 10, 2011.

Ali, N., Hindi, M., & Yampolskiy, R. V. (2011). Evaluation of authorship attribution software on a Chat bot corpus. XXIII International Symposium on Information, Communication and Automation Technologies (ICAT), Sarajevo, Bosnia and Herzegovina.

ALICE (2011). Alice. Retrieved June 12, 2011.

CleverBot (2011). Cleverbot. Retrieved July 5, 2011.

HAL (2011). AI Research. Retrieved June 16, 2011.

Holmes, D. I., & Forsyth, R. S. (1995). The Federalist Revisited: New Directions in Authorship Attribution. Literary and Linguistic Computing, 10(2).

Jabberwacky (2011). Jabberwacky-live chat bot-AI Artificial Intelligence chatbot. Retrieved June 10, 2011.

Jabberwock (2011). Retrieved June 12, 2011.

Jain (2000). Biometric Identification. Communications of the ACM, 43.

Jain, A., Ross, A. A., & Nandakumar, K. (2011). Introduction to Biometrics. Springer-Verlag New York, LLC.

Jeeney (2011). Artificial Intelligence Online. http://www.jeeney.com/. Retrieved March 11, 2011.

Loebner, H. G. (2012). Home Page of The Loebner Prize. Retrieved Jan 3, 2012.

Malyutov (2005). Authorship attribution of texts: a review. Electronic Notes in Discrete Mathematics, 21.

MyBot (2011). Chatbot Mybot, Artificial Intelligence. Retrieved Jan 8, 2011.

Orebaugh (2006). An Instant Messaging Intrusion Detection System Framework: Using character frequency analysis for authorship identification and validation. 40th Annual IEEE International Carnahan Conference Security Technology, Lexington, KY.

RapidMiner (2011). Rapid-I. Retrieved Dec 20, 2011.

SkyNet (2011). SkyNet -AI. Retrieved April 20, 2011.

Suzette (2011). Retrieved Feb 7, 2011.

TalkBot (2011). TalkBot-A simple talk bot. Retrieved April 14, 2011.

Webopedia (2011). What is chat bot? A Word Definition from the Webpedia Computer Dictionary. Retrieved June 20.

Wikipedia (2011). Chatterbot-Wikipedia, the free encyclopedia. Retrieved June 22, 2011.

Yampolskiy, R. V., & Govindaraju, V. (2008). Behavioral Biometrics: a Survey and Classification. International Journal of Biometrics (IJBM), 1(1).