=Paper=
{{Paper
|id=None
|storemode=property
|title=Linguistic Profiling and Behavioral Drift in Chat Bots
|pdfUrl=https://ceur-ws.org/Vol-841/submission_22.pdf
|volume=Vol-841
|dblpUrl=https://dblp.org/rec/conf/maics/AliSY12
}}
==Linguistic Profiling and Behavioral Drift in Chat Bots==
Nawaf Ali, Derek Schaeffer, Roman V. Yampolskiy
Computer Engineering and Computer Science Department
J. B. Speed School of Engineering
University of Louisville
Louisville, KY, USA
ntali001@louisville.edu, dwscha02@louisville.edu, roman.yampolskiy@louisville.edu
Abstract
When trying to identify the author of a book, a paper, or a letter, the object is to detect a style that distinguishes one author from another. With recent developments in artificial intelligence, chat bots sometimes play the role of the text authors. The focus of this study is to investigate the change in chat bot linguistic style over time and its effect on authorship attribution. The study shows that chat bots did show a behavioral drift in their style. Results from this study imply that any non-zero change in linguistic style results in difficulty for our chat bot identification process.

I. Introduction
Biometric identification is a way to discover or verify the identity of who we claim to be by using physiological and behavioral traits (Jain, 2000). To serve as an identifier, a biometric should have the following properties: (a) universality, which means that a characteristic should apply to everybody, (b) uniqueness, the characteristics will be unique to each individual being studied, (c) permanence, the characteristics should not change over time in a way that will obscure the identity of a person, and (d) collectability, the ability to measure such characteristics (Jain, Ross & Nandakumar, 2011).

Biometric identification technologies are not limited to fingerprints. Behavioral traits associated with each human provide a way to identify the person by a biometric profile. Behavioral biometrics provide an advantage over traditional biometrics in that they can be collected unbeknownst to the user under investigation (Yampolskiy & Govindaraju, 2008). Characteristics pertaining to language, composition, and writing style, such as particular syntactic and structural layout traits, vocabulary usage and richness, unusual language usage, and stylistic traits remain relatively constant. Identifying and learning these characteristics is the primary focus of authorship authentication (Orebaugh, 2006).

Authorship identification is a research field interested in finding traits that can identify the original author of a document. The two main subfields of authorship identification are: (a) Authorship recognition, when there is more than one author claiming a document, and the task is to identify the correct author based on the study of style and other author-specific features. (b) Authorship verification, where the task is to verify that an author of a document is the correct author based on that author’s profile and the study of the document (Ali, Hindi & Yampolskiy, 2011). The twelve Federalist papers claimed by both Alexander Hamilton and James Madison are an example of authorship recognition (Holmes & Forsyth, 1995). Detecting plagiarism is a good example of the second type. Authorship verification is mostly used in forensic investigation.

When examining people, a major challenge is that the writing style of the writer might evolve and develop with time, a concept known as behavioral drift (Malyutov, 2005). Chat bots, which are built algorithmically, have never been analyzed from this perspective. A study on identifying chat bots using the Java Graphical Authorship Attribution Program (JGAAP) has shown that it is possible to identify chat bots by analyzing their chat logs for linguistic features (Ali, Hindi & Yampolskiy, 2011).

A. Chat bots
Chat bots are computer programs mainly used in applications such as online help, e-commerce, customer services, call centers, and internet gaming (Webopedia, 2011).

Chat bots are typically perceived as engaging software entities that humans may communicate with, and that attempt to fool the human into thinking that he or she is talking to another human. Some chat bots use Natural Language Processing Systems (NLPS) when replying to a statement, while the majority of other bots scan for keywords within the input and pull a reply with the most matching keywords (Wikipedia, 2011).

B. Motivations
The ongoing threats by criminal individuals have migrated from actual physical threats and violence to another dimension, the Cyber World. Criminals try to steal others’ information and identity by any means. Researchers are following up and doing more work trying to prevent any criminal activities, whether it is identity theft or even terrorist threats.

II. Application and Data Collection
Data was downloaded from the Loebner prize website (Loebner, 2012), in which a group of human judges from different disciplines and ages are set to talk with the chat bots, and the chat bots get points depending on the quality of the conversation that the chat bot produces. A study was made on chat bot authorship with data collected in 2011 (Ali, Hindi & Yampolskiy, 2011); the study demonstrated the feasibility of using authorship identification techniques on chat bots. The data in the current study was collected over a period of years. Our data only pertained to chat bots that were under study in (Ali, Hindi & Yampolskiy, 2011), which is why this study does not cover every year of the Loebner contest, which started in 1996. Only the years that contain the chat bots under study were used in this research.

III. Data Preparation
The collected data had to be preprocessed by deleting unnecessary labels like the chat bot name and the time and date of the conversation (Fig. 1). A Perl script was used to clean the files and split each chat into two text files, one for the chat bot under study, the other for the human judge. The judge part was ignored, and only the chat bot text was analyzed.

Fig. 1. Sample conversation between a chat bot and a judge.

IV. Chat Bots Used
Eleven chat bots were used in the initial experiments: Alice (ALICE, 2011), CleverBot (CleverBot, 2011), Hal (HAL, 2011), Jeeney (Jeeney, 2011), SkyNet (SkyNet, 2011), TalkBot (TalkBot, 2011), Alan (Alan, 2011), MyBot (MyBot, 2011), Jabberwock (Jabberwock, 2011), Jabberwacky (Jabberwacky, 2011), and Suzette (Suzette, 2011). These served as the main baseline against which we intend to compare the chat bots under study, which were: Alice, Jabberwacky, and Jabberwock.

V. Experiments
The experiments were conducted using RapidMiner (RapidMiner, 2011). A model was built for authorship identification that accepts the training text and creates a word list and a model using a Support Vector Machine (SVM) (Fig. 2); this word list and model are then applied to the test text, which in our case is data from the Loebner prize site (Loebner, 2012).

Fig. 2. Training model using Rapid Miner. (Operators shown: Process Document, Normalize, Validation, Store Model, Store Word list.)

In Fig. 3 we use the saved word list and model as input for the testing stage, and the output gives us the percentage prediction of the tested files.

Fig. 3. Testing stage using Rapid Miner. (Operators shown: Get Word list, Get Model, Process Document, Normalize, Apply Model.)

The data was tested using two different saved models: one trained with the complete set of chat bots (eleven bots), and a second trained using only the three chat bots under study.

When performing the experiments, the model outputs confidence values, which reflect how confident we are that a chat bot is identified correctly. The chat bot with the highest confidence value (printed in boldface in all tables) is the predicted bot according to the model. Table 1 shows how much confidence we have in our tested data for Alice’s text files in different years, when using eleven chat bots for training.

Table 1. Confidence level of Alice’s files when tested with all eleven chat bots used in training.
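The training and testing stages described in the Experiments section can be illustrated with a minimal Python sketch. It is not the RapidMiner process used in the study: the SVM is replaced here by a much simpler stand-in (cosine similarity between a test file and each bot's averaged, normalized term-frequency vector), and all function names and the sample texts are hypothetical. It only shows the flow of building a word list, storing a model, applying both to a test text, and picking the bot with the highest confidence value.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_word_list(training_texts):
    """Collect the sorted vocabulary seen in any training document."""
    vocab = set()
    for docs in training_texts.values():
        for doc in docs:
            vocab.update(tokenize(doc))
    return sorted(vocab)

def vectorize(text, word_list):
    """Map text to a unit-length term-frequency vector over word_list."""
    counts = Counter(tokenize(text))
    vec = [counts[w] for w in word_list]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def train(training_texts):
    """Return (word_list, model); model maps each bot to its mean vector.

    Assumption: a centroid per bot stands in for the SVM used in the paper.
    """
    word_list = build_word_list(training_texts)
    model = {}
    for bot, docs in training_texts.items():
        vectors = [vectorize(d, word_list) for d in docs]
        model[bot] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return word_list, model

def confidences(text, word_list, model):
    """Score the text against every bot and scale the scores to sum to 1."""
    vec = vectorize(text, word_list)
    raw = {}
    for bot, centroid in model.items():
        dot = sum(a * b for a, b in zip(vec, centroid))
        c_norm = math.sqrt(sum(c * c for c in centroid))
        raw[bot] = dot / c_norm if c_norm else 0.0
    total = sum(raw.values())
    return {bot: (s / total if total else 0.0) for bot, s in raw.items()}

def predict(text, word_list, model):
    """The predicted bot is the one with the highest confidence value."""
    scores = confidences(text, word_list, model)
    return max(scores, key=scores.get), scores

# Made-up training snippets, not Loebner prize data.
training = {
    "Alice": ["I am a robot and I like to chat", "Do you like robots"],
    "Jabberwacky": ["That is not what you said before", "You said that before"],
    "Jabberwock": ["The weather is lovely today", "Lovely weather is it not"],
}
word_list, model = train(training)
bot, scores = predict("Do you like to chat with a robot", word_list, model)
print(bot, {b: round(s, 2) for b, s in scores.items()})
```

Words in the test text that are absent from the saved word list are ignored, which mirrors the idea of applying a stored word list at test time; a real SVM would additionally learn per-word weights rather than averaging vectors.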
Table 2 shows the confidence level of Alice’s files when using only the three chat bots under study.

Table 2. Confidence level of Alice’s files when tested with only three chat bots used in training.

Table 3 shows the confidence level of Jabberwacky’s files when tested with the complete set of eleven chat bots.

Table 3. Confidence level of Jabberwacky’s files when tested with all 11 chat bots used in training.
Fig. 4 shows the results of testing the three chat bots
over different years when training our model using all
eleven chat bots.
The results in Fig. 5 come from the experiments that use a training set based on the three chat bots under study: Alice, Jabberwacky, and Jabberwock. Jabberwock did not take part in the 2005 contest.
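The per-year identification percentages plotted in Fig. 4 and Fig. 5 can be computed along the following lines. This is a sketch under an assumption: the paper does not spell out the formula, so the percentage is taken here to be the share of a bot's test files in a given year whose highest-confidence prediction is that bot. The function name and the sample records are hypothetical and are not the paper's results.

```python
from collections import defaultdict

def identification_percentage(results):
    """results: iterable of (year, true_bot, predicted_bot), one per test file.

    Returns {true_bot: {year: percent of that bot's files predicted correctly}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    correct = defaultdict(lambda: defaultdict(int))
    for year, true_bot, predicted_bot in results:
        totals[true_bot][year] += 1
        if predicted_bot == true_bot:
            correct[true_bot][year] += 1
    return {
        bot: {year: 100.0 * correct[bot][year] / n for year, n in years.items()}
        for bot, years in totals.items()
    }

# Hypothetical predictions, not the paper's results.
results = [
    (2001, "Alice", "Alice"),
    (2001, "Alice", "Alice"),
    (2001, "Jabberwock", "Alice"),
    (2004, "Jabberwacky", "Jabberwacky"),
    (2004, "Jabberwacky", "Alice"),
]
percentages = identification_percentage(results)
print(percentages)
```

A year in which a bot has no test files (for example, a contest it did not enter) simply has no entry, rather than a 0% value.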
Table 4 shows the confidence level of Jabberwock’s
files when all the chat bots are used for training.
Table 4. Confidence level of Jabberwock’s files when tested with all
eleven chat bots used in training.
Fig. 4. Identification percentage over different years using all eleven chat bots for training.

Fig. 5. Identification percentage over different years using only the three chat bots under study for training.

VI. Conclusions and Future Work
The initial experiments conducted on the collected data did show a variation between chat bots, which is expected. It is not expected that all chat bots will act the same way, since they have different creators and different algorithms. Some chat bots are more intelligent than others; the Loebner contest aims to contrast such differences. The Alice bot showed some consistency over the years under study, but in 2005 Alice’s style was not as recognizable as in other years. Jabberwacky performed well for all years when training with just three bots; when the training set contained all eleven chat bots, it was not identified in 2001 and gave a 40% correct prediction in 2005. Jabberwock, the third chat bot under study here, was the least consistent of the three: it gave 0% correct prediction in 2001 and 2004, and 91% for 2011, which may indicate that Jabberwock’s vocabulary did improve in a way that gave it its own style.
With the models trained on three chat bots, Jabberwacky was identified 100% correctly over all years. Alice did well for all years except for 2005, and Jabberwock was not identified at all in 2001 and 2004.

With these initial experiments, we can state that some chat bots do change their style, most probably depending on the intelligent algorithms used in initializing conversations. Other chat bots do have a steady style and do not change over time.

More data is required to get reliable results; we only managed to obtain data from the Loebner prize competition, which in some cases was just one 4KB text file. With sufficient data, results should be more representative and accurate.

Additional research on these chat bots will be conducted, and work on finding specific features to identify the chat bots will continue. This is a burgeoning research area and much work still needs to be done.

References
Alan. (2011). AI Research. Retrieved June 10, 2011, from http://www.a-i.com/show_tree.asp?id=59&level=2&root=115

Ali, N., Hindi, M., & Yampolskiy, R. V. (2011). Evaluation of authorship attribution software on a Chat bot corpus. XXIII International Symposium on Information, Communication and Automation Technologies (ICAT), Sarajevo, Bosnia and Herzegovina, 1-6.

ALICE. (2011). ALICE. Retrieved June 12, 2011, from http://alicebot.blogspot.com/

CleverBot. (2011). CleverBot. Retrieved July 5, 2011, from http://cleverbot.com/

HAL. (2011). AI Research. Retrieved June 16, 2011, from http://www.a-i.com/show_tree.asp?id=97&level=2&root=115

Holmes, D. I., & Forsyth, R. S. (1995). The Federalist Revisited: New Directions in Authorship Attribution. Literary and Linguistic Computing, 10(2), 111-127.

Jabberwacky. (2011). Jabberwacky-live chat bot-AI Artificial Intelligence chatbot. Retrieved June 10, 2011, from http://www.jabberwacky.com/

Jabberwock. (2011). Jabberwock Chat. Retrieved June 12, 2011, from http://www.abenteuermedien.de/jabberwock/

Jain, A. (2000). Biometric Identification. Communications of the ACM, 43(2), 91-98.

Jain, A., Ross, A. A., & Nandakumar, K. (2011). Introduction to Biometrics: Springer-Verlag New York, LLC.

Jeeney. (2011). Artificial Intelligence Online. Retrieved March 11, 2011, from http://www.jeeney.com/

Loebner, H. G. (2012). Home Page of The Loebner Prize. Retrieved Jan 3, 2012, from http://loebner.net/Prizef/loebner-prize.html

Malyutov, M. B. (2005). Authorship attribution of texts: a review. Electronic Notes in Discrete Mathematics, 21, 353-357.

MyBot. (2011). Chatbot Mybot, Artificial Intelligence. Retrieved Jan 8, 2011, from http://www.chatbots.org/chatbot/mybot/

Orebaugh, A. (2006). An Instant Messaging Intrusion Detection System Framework: Using character frequency analysis for authorship identification and validation. 40th Annual IEEE International Carnahan Conference Security Technology, Lexington, KY.

RapidMiner. (2011). Rapid-I. Retrieved Dec 20, 2011, from http://rapid-i.com/

SkyNet. (2011). SkyNet - AI. Retrieved April 20, 2011, from http://home.comcast.net/~chatterbot/bots/AI/Skynet/

Suzette. (2011). SourceForge ChatScript Project. Retrieved Feb 7, 2011, from http://chatscript.sourceforge.net/

TalkBot. (2011). TalkBot - A simple talk bot. Retrieved April 14, 2011, from http://code.google.com/p/talkbot/

Webopedia. (2011). What is chat bot? A Word Definition from the Webopedia Computer Dictionary. Retrieved June 20, from www.webopedia.com/TERM/C/chat_bot.html

Wikipedia. (2011). Chatterbot - Wikipedia, the free encyclopedia. Retrieved June 22, 2011, from www.en.wikipedia.org/wiki/Chatterbot

Yampolskiy, R. V., & Govindaraju, V. (2008). Behavioral Biometrics: a Survey and Classification. International Journal of Biometrics (IJBM), 1(1), 81-113.