
Linguistic Profiling and Behavioral Drift in Chat Bots

Nawaf Ali

ntali001@louisville.edu

Derek Schaeffer

dwscha02@louisville.edu

Roman V. Yampolskiy

roman.yampolskiy@louisville.edu

Computer Engineering and Computer Science Department, J. B. Speed School of Engineering, University of Louisville, Louisville, KY, USA

When trying to identify the author of a book, a paper, or a letter, the objective is to detect a style that distinguishes one author from another. With recent developments in artificial intelligence, chat bots sometimes play the role of text authors. This study investigates how chat bot linguistic style changes over time and how that change affects authorship attribution. The study shows that some chat bots exhibit behavioral drift in their style. The results imply that any change in linguistic style makes our chat bot identification process more difficult.

Biometric identification technologies are not limited to fingerprints. Behavioral traits associated with each human provide a way to identify the person by a biometric profile. Behavioral biometrics have an advantage over traditional biometrics in that they can be collected without the knowledge of the user under investigation (Yampolskiy & Govindaraju, 2008). Characteristics pertaining to language, composition, and writing style, such as particular syntactic and structural layout traits, vocabulary usage and richness, unusual language usage, and stylistic traits, remain relatively constant. Identifying and learning these characteristics is the primary focus of authorship authentication (Orebaugh, 2006).

Authorship identification is a research field concerned with finding traits that can identify the original author of a document. Its two main subfields are: (a) authorship recognition, where more than one author claims a document and the task is to identify the correct author based on the study of style and other author-specific features; and (b) authorship verification, where the task is to verify that the claimed author of a document is the correct author, based on that author’s profile and the study of the document (Ali, Hindi & Yampolskiy, 2011). The twelve Federalist papers claimed by both Alexander Hamilton and James Madison are an example of authorship recognition (Holmes & Forsyth, 1995). Detecting plagiarism is a good example of the second type. Authorship verification is mostly used in forensic investigation.

When examining people, a major challenge is that the writing style of the writer might evolve and develop with time, a concept known as behavioral drift (Malyutov, 2005). Chat bots, which are built algorithmically, have never been analyzed from this perspective. A study on identifying chat bots using the Java Graphical Authorship Attribution Program (JGAAP) has shown that it is possible to identify chat bots by analyzing their chat logs for linguistic features (Ali, Hindi & Yampolskiy, 2011).

A. Chat bots

Chat bots are computer programs mainly used in applications such as online help, e-commerce, customer services, call centers, and internet gaming (Webopedia, 2011).

Chat bots are typically perceived as engaging software entities with which humans may communicate, and which attempt to fool the human into thinking that he or she is talking to another human. Some chat bots use Natural Language Processing Systems (NLPS) when replying to a statement, while the majority of other bots scan the input for keywords and pull the reply with the most matching keywords (Wikipedia, 2011).
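The keyword-matching strategy described above can be illustrated with a minimal Python sketch. It is not taken from any of the chat bots in this study; the reply table, the `tokenize` helper, and the `pick_reply` function are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical canned replies, each tagged with the keywords that trigger it.
REPLIES = [
    ({"hello", "hi", "hey"}, "Hello! How are you today?"),
    ({"weather", "rain", "sunny"}, "I never go outside, so I cannot say."),
    ({"name", "who", "you"}, "I am a simple keyword bot."),
]
DEFAULT_REPLY = "Tell me more."


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def pick_reply(user_input):
    """Return the reply whose keyword set overlaps the input the most."""
    words = tokenize(user_input)
    best_reply, best_score = DEFAULT_REPLY, 0
    for keywords, reply in REPLIES:
        score = len(words & keywords)  # number of matching keywords
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply
```

Because the reply depends only on a fixed table, a bot of this kind would be expected to reuse the same vocabulary across conversations, which is one reason its output may carry a recognizable style.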

B. Motivations

Criminal threats have migrated from physical threats and violence to another dimension, the cyber world. Criminals try to steal other people's information and identities by any means. Researchers are responding with more work aimed at preventing such criminal activities, whether identity theft or even terrorist threats.

II. Application and Data Collection

Data was downloaded from the Loebner prize website (Loebner, 2012). In the Loebner contest, a group of human judges from different disciplines and ages talk with the chat bots, and each chat bot earns points depending on the quality of the conversation it produces. An earlier study on chat bot authorship used data collected in 2011 (Ali, Hindi & Yampolskiy, 2011) and demonstrated the feasibility of using authorship identification techniques on chat bots. The data in the current study was collected over a period of years. It pertains only to the chat bots studied in (Ali, Hindi & Yampolskiy, 2011), which is why this study does not cover every year of the Loebner contest, which started in 1996. Only the years in which the chat bots under study appear were used in this research.

III. Data Preparation

The collected data had to be preprocessed by deleting unnecessary labels such as the chat bot name and the date and time of the conversation (Fig. 1). A Perl script was used to clean the files and split each chat into two text files, one for the chat bot under study and the other for the human judge. The judge part was ignored, and only the chat bot text was analyzed.
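The original Perl script is not reproduced here. As a rough Python sketch of the same cleaning step, the following assumes a hypothetical transcript layout of `[timestamp] Speaker: utterance` with an optional bracketed timestamp; the actual Loebner log layout may differ, and `split_transcript` is an illustrative name.

```python
import re

# Assumed line layout (hypothetical): "[timestamp] Speaker: utterance".
# The bracketed timestamp is optional.
LINE_RE = re.compile(r"^(?:\[[^\]]*\]\s*)?(?P<speaker>[^:]+):\s*(?P<text>.*)$")


def split_transcript(lines, bot_name):
    """Separate a transcript into bot utterances and judge utterances.

    Speaker labels and timestamps are dropped; only the spoken text is kept.
    Lines that do not match the assumed layout are skipped.
    """
    bot_text, judge_text = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        speaker = match.group("speaker").strip().lower()
        text = match.group("text").strip()
        if speaker == bot_name.lower():
            bot_text.append(text)
        else:
            judge_text.append(text)
    return bot_text, judge_text
```

Only the first list would be carried forward to the analysis, mirroring the decision to ignore the judge side of each conversation.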

IV. Chat Bots Used

Eleven chat bots were used in the initial experiments: Alice (ALICE, 2011), CleverBot (CleverBot, 2011), Hal (HAL, 2011), Jeeney (Jeeney, 2011), SkyNet (SkyNet, 2011), TalkBot (TalkBot, 2011), Alan (Alan, 2011), MyBot (MyBot, 2011), Jabberwock (Jabberwock, 2011), Jabberwacky (Jabberwacky, 2011), and Suzette (Suzette, 2011). These formed the main baseline against which we compare the three chat bots under study: Alice, Jabberwacky, and Jabberwock.

The experiments were conducted using RapidMiner (RapidMiner, 2011). A model was built for authorship identification that accepts the training text and creates a word list and a model using a Support Vector Machine (SVM) (Fig. 2). This word list and model are then applied to the test text, which in our case is data from the Loebner prize site (Loebner, 2012).
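To make the word list step concrete, the sketch below builds a vocabulary from training text and maps any text onto a normalized term-frequency vector over that vocabulary. It is a plain Python approximation, not the RapidMiner process itself: the function names are hypothetical, the normalization is assumed to be scaling to unit Euclidean length (the study does not state which normalization was used), and the SVM training step is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words in order."""
    return re.findall(r"[a-z']+", text.lower())


def build_word_list(documents):
    """Return the sorted vocabulary of all training documents."""
    vocab = set()
    for doc in documents:
        vocab.update(tokenize(doc))
    return sorted(vocab)


def vectorize(text, word_list):
    """Map text to a unit-length term-frequency vector over word_list.

    Words that are not in the word list are ignored, as happens when a
    saved word list is applied to unseen test text.
    """
    counts = Counter(tokenize(text))
    vec = [float(counts[w]) for w in word_list]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec
```

Vectors produced this way for the training files would be the input to a classifier such as an SVM, and the same saved word list would be reused unchanged for the test files so that both sets share one feature space.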

[Fig. 2. Training process; operator labels recovered from the figure: Process Document, Normalize, Validation, Store Word list.]

In Fig. 3 we use the saved word list and model as input for the testing stage, and the output will give us the percentage prediction of the tested files.

[Fig. 3. Testing process; operator labels recovered from the figure: Get Word list, Process Document, Normalize, Get Model.]

The data was tested using two different saved models: the first was trained on the complete set of eleven chat bots, and the second was trained on only the three chat bots under study.

When performing the experiments, the model outputs confidence values, which reflect how confident we are that a chat bot is identified correctly. The chat bot with the highest confidence value (printed in boldface in all tables) is the predicted bot according to the model. Table 1 shows the confidence values for Alice’s text files in different years, when using eleven chat bots for training.
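The rule for turning confidence values into a prediction can be sketched as follows. The helper names are hypothetical, and reading the identification percentage as the share of test files attributed to the correct bot is an assumption about how the reported percentages were obtained.

```python
def predict_bot(confidences):
    """Return the bot with the highest confidence value."""
    return max(confidences, key=confidences.get)


def identification_percentage(per_file_confidences, true_bot):
    """Percentage of test files whose predicted bot equals true_bot."""
    if not per_file_confidences:
        raise ValueError("no test files given")
    hits = sum(
        1 for conf in per_file_confidences if predict_bot(conf) == true_bot
    )
    return 100.0 * hits / len(per_file_confidences)


# Made-up confidence values for two test files (not results from the study).
example_files = [
    {"Alice": 0.6, "Jabberwacky": 0.3, "Jabberwock": 0.1},
    {"Alice": 0.2, "Jabberwacky": 0.5, "Jabberwock": 0.3},
]
alice_rate = identification_percentage(example_files, "Alice")
```

With these placeholder values, one of the two files is attributed to Alice, so `alice_rate` is 50.0.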

Fig 5. Identification percentage over different years using only the three chat bots under study for training.

VI. Conclusions and Future Work

The initial experiments conducted on the collected data did show a variation between chat bots, which is expected. It is not expected that all chat bots will act the same way, since they have different creators and different algorithms.

Some chat bots are more intelligent than others; the Loebner contest aims to contrast such differences. The Alice bot showed some consistency over the years under study, but in 2005 Alice’s style was not as recognizable as in other years. Jabberwacky performed well for all years when training with just three bots. When the training set contained all eleven chat bots, however, Jabberwacky was not identified in 2001 and gave a 40% correct prediction in 2005. Jabberwock, the third chat bot under study here, was the least consistent of all the bots: it gave 0% correct prediction in 2001 and 2004 and 91% in 2011, which may indicate that Jabberwock’s vocabulary improved in a way that gave it its own style.
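As a crude way to summarize such year-to-year variation, the spread between the best and worst yearly prediction rate can be computed. This is only an illustrative indicator, not a metric used in the study, and `prediction_range` is a hypothetical helper.

```python
def prediction_range(rate_by_year):
    """Return (lowest, highest, spread) of yearly correct-prediction rates."""
    rates = list(rate_by_year.values())
    low, high = min(rates), max(rates)
    return low, high, high - low


# Correct-prediction percentages reported above for Jabberwock.
jabberwock = {2001: 0, 2004: 0, 2011: 91}
low, high, spread = prediction_range(jabberwock)
```

For the Jabberwock figures the spread is 91 percentage points, whereas a bot with a steady style would show a spread close to zero.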

With three chat bot training models, Jabberwacky was identified 100% correctly over all years. Alice did well for all years except for 2005, and Jabberwock was not identified at all in 2001 and 2004.

With these initial experiments, we can state that some chat bots do change their style, most probably depending on the intelligent algorithms used in initializing conversations. Other chat bots do have a steady style and do not change over time.

More data is required to get reliable results; we only managed to obtain data from the Loebner prize competition, which in some cases was just one 4KB text file. With sufficient data, results should be more representative and accurate.

Additional research on these chat bots will be conducted, and work on finding specific features that identify the chat bots will continue. This is a burgeoning research area, and much work remains to be done.

Alan. (2011). AI Research. Retrieved June 10, 2011, from http://www.ai.com/show_tree.asp?id=59&level=2&root=115

Ali, N., Hindi, M., & Yampolskiy, R. V. (2011). Evaluation of authorship attribution software on a Chat bot corpus. XXIII International Symposium on Information, Communication and Automation Technologies (ICAT), Sarajevo, Bosnia and Herzegovina, 1-6.

ALICE. (2011). ALICE. Retrieved June 12, 2011, from http://alicebot.blogspot.com/

CleverBot. (2011). CleverBot. Retrieved July 5, 2011, from http://cleverbot.com/

HAL. (2011). AI Research. Retrieved June 16, 2011, from http://www.ai.com/show_tree.asp?id=97&level=2&root=115

Holmes, D. I., & Forsyth, R. S. (1995). The Federalist Revisited: New Directions in Authorship Attribution. Literary and Linguistic Computing, 10(2), 111-127.

Jabberwacky. (2011). Jabberwacky-live chat bot-AI Artificial Intelligence chatbot. Retrieved June 10, 2011, from http://www.jabberwacky.com/

Jabberwock. (2011). Jabberwock Chat. Retrieved June 12, 2011, from http://www.abenteuermedien.de/jabberwock/

Jain, A. (2000). Biometric Identification. Communications of the ACM, 43(2), 91-98.

Jain, A., Ross, A. A., & Nandakumar, K. (2011). Introduction to Biometrics: Springer-Verlag New York, LLC.

Jeeney. (2011). Artificial Intelligence Online. Retrieved March 11, 2011, from http://www.jeeney.com/

Loebner, H. G. (2012). Home Page of The Loebner Prize. Retrieved Jan 3, 2012, from http://loebner.net/Prizef/loebner-prize.html

Malyutov, M. B. (2005). Authorship attribution of texts: a review. Electronic Notes in Discrete Mathematics, 21, 353-357.

MyBot. (2011). Chatbot Mybot, Artificial Intelligence. Retrieved Jan 8, 2011, from http://www.chatbots.org/chatbot/mybot/

Orebaugh, A. (2006). An Instant Messaging Intrusion Detection System Framework: Using character frequency analysis for authorship identification and validation. 40th Annual IEEE International Carnahan Conference Security Technology, Lexington, KY.

RapidMiner. (2011). Rapid-I. Retrieved Dec 20, 2011, from http://rapid-i.com/

SkyNet. (2011). SkyNet - AI. Retrieved April 20, 2011, from http://home.comcast.net/~chatterbot/bots/AI/Skynet/

Suzette. (2011). SourceForge ChatScript Project. Retrieved Feb 7, 2011, from http://chatscript.sourceforge.net/

TalkBot. (2011). TalkBot - A simple talk bot. Retrieved April 14, 2011, from http://code.google.com/p/talkbot/

Webopedia. (2011). What is chat bot? A Word Definition from the Webopedia Computer Dictionary. Retrieved June 20, from www.webopedia.com/TERM/C/chat_bot.html

Wikipedia. (2011). Chatterbot - Wikipedia, the free encyclopedia. Retrieved June 22, 2011, from www.en.wikipedia.org/wiki/Chatterbot

Yampolskiy, R. V., & Govindaraju, V. (2008). Behavioral Biometrics: a Survey and Classification. International Journal of Biometrics (IJBM), 1(1), 81-113.