Contextual Binding and Deception Detection

Jim Q. Chen, Ph.D.
National Defense University, U.S.A.

Copyright held by the author. All rights reserved.

Abstract

Deception is frequently used in cyber attacks. Detecting deception is always a challenge, as witnessed in attacks in social media and other online environments. Contexts can help to identify deception. Unfortunately, there is not much literature available on this aspect. This paper explores the unique properties of contextual binding and examines the roles that it plays. It also proposes a novel approach to detecting deception in the cyber domain that utilizes contextual binding.

Introduction

A context is defined in the Merriam-Webster dictionary as “the interrelated conditions in which something exists or occurs” or as “the parts of a discourse that surround a word or passage and can throw light on its meaning”. (http://www.merriam-webster.com/dictionary/context) It is also defined in Brézillon (1999) and Brézillon (2002) as “a collection of relevant conditions and surrounding influences that make a situation unique and comprehensible”. In this sense, a context helps to disambiguate meaning and find out the actual referent. Hence, it is essential in data mining and big data analytics.

There are various approaches to context analysis. Brézillon (2003) uses contextual graphs to address the dynamic of context. Gaifman (2008) uses syntactically represented context operators in the analysis of contextuality. Grossi, Dignum, and Meyer (2006) propose a notion of contextual terminology to “reason within contexts (intra-contextual reasoning)” and to “reason also about contexts and their interplay (inter-contextual reasoning)”. Rebuschi and Lihoreau (2009) address “the connections between knowledge and context” with the contextual epistemic logic. McCarthy (1993) discusses “formalizing contexts as first class objects”. From a linguistic perspective, Kay (1989) mentions contextual operators, which, in his terms, “are lexical items or grammatical constructions whose semantic value consists, at least in part, of instructions to find in, or impute to, the context a certain kind of information structure and to locate the information presented by the sentence within that information structure in a specified way”.

All these studies scrutinize contexts from varied perspectives. They all show the significance of contexts in nailing down meaning or interpretation. In the same spirit, this paper proposes the Contextual Binding Conditions and the Detection Condition utilizing contextual operators, on the basis of linguistic samples. These conditions are then applied to both language disambiguation and deception detection in the cyber domain. The success of the application confirms the validity of these conditions.

Challenges

As discussed previously, contexts are used in linguistic analysis, in building intelligent agents, and in other artificial intelligence fields. However, no formal methods that employ contexts or contextual operators have ever been used in detecting deception in the cyber domain. This paper intends to explore this possibility.

Before moving on, let us note two types of deception. Caddell (2004) defines them as fabrication and manipulation. He states, “If false information is created and presented as true, this is fabrication.” “Manipulation, on the other hand, is the use of information which is technically true, but is being presented out of context in order to create a false implication.” Almeshekah and Spafford (2014) also state: “Deception always involves two basic steps, hiding the real and showing the false”.

How can deception, specifically fabrication and manipulation, be detected in the cyber domain? This is one of the challenges we are facing. Even though there is not much literature available on this aspect, studies in similar fields may be looked at. Burgoon, Blair, Qin, and Nunamaker (2003) use decision trees in detecting deception within online chat messages. They utilize “16 linguistic features that can be automated to return assessments of the likely truthful or deceptiveness of a piece of text”. They find out that “deceivers do utilize language differently than truth tellers”. However, it has to be pointed out that the 16 linguistic features are pre-defined, so deception employing features other than these 16 may not be detected.

Zhou, Shi, and Zhang (2008) develop the Statistical Language Models (SLMs), which consider “all of the words in a text as potential features without relying on the extraction of a predefined set of cues to deception”. Word dependencies are learned to “capture semantic relationships and dependency relationships among words so as to approximate the meaning of sentences, which can benefit deception detection”. This method is better than the method used in Burgoon, Blair, Qin, and Nunamaker (2003). However, it is relatively time-consuming, as “SLMs consider all possible n-grams as features and implicitly represent the importance of those features according to their contribution to the quality of language modelling”. In addition, this approach does not address the detection of fabrication or manipulation, as it is not designed for that purpose.

To address all these challenges and to figure out a holistic and dynamic solution, Chen and Duvall (2014) propose the Operational-Level Cybersecurity Strategy Formation Framework, which consists of a Contextual Analysis Component among other components in making strategic decisions. This paper further explores the inner workings of the Contextual Analysis Component.

Proposal

A novel approach is proposed in this section, where the contextual binding relationship is explored and then used in detecting abnormal behavior. Within a contextual binding relationship, a contextual binding operator plays a crucial role. It helps to set up the baseline in a context, in which the Detection Condition can be applied.

The Contextual Binding Conditions

A contextual binding operator is deterministic in disambiguation. Let us take a look at a linguistic example demonstrated in the English sentence in (1) below.

    (1) John likes to buy a book_i and read it_i within three days.

Any English speaker knows that the following interpretation is acceptable: “John likes to buy a book and read the book that he buys within three days.” Any English speaker also knows that the interpretation below is not acceptable: “John likes to buy a book and read the map within three days.”

The pronoun “it” in this sentence refers back to the noun phrase “a book”, which precedes the pronoun in the same sentence. To a certain extent, pieces of information provided previously can serve as contextual components for entities in the same sentence or following sentences. In this particular case, the agent of the action is the noun phrase “John” in the subject position; the patient of the action is the noun phrase “a book” in the object position; and the predicate, i.e. the action, is “purchasing and reading x, which is a book in this case”. Now, the noun phrase “a book” plays the role of a contextual operator, which becomes available for the interpretation of the pronoun “it”, as it satisfies the condition of being a singular non-human entity, just like the pronoun “it”, in this particular context.

Based on the observation of the contextual relationship, it may be claimed that the pronoun “it” is contextually bound by the contextual operator of “what”, namely, “a book” in this particular case. In this contextual binding case, if the pronoun “it” refers to another entity, such as “a map”, rather than the entity “a book”, the interpretation is immediately recognized as being abnormal or regarded as being unacceptable. This clearly reveals a contextual binding relationship, which can be defined as follows:

The Basic Contextual Binding Condition:
Assume X is an entity, and CO is a contextual operator.
    (i) If X is directly related to CO in such a setting:
            CO_i {……X_i……}
        then X_i is contextually bound by CO_i.
The entity X_i is directly related to CO_i iff CO_i provides a context that the interpretation of X_i solely depends on.

Now, the contextual relationship in (1) can be captured in the following schematic configuration:

    CO_book {……X_book……}

Applying the Basic Contextual Binding Condition, the pronoun “it” is contextually bound by the contextual operator “book”. This means that the pronoun “it” has to be interpreted as “the book” if the Basic Contextual Binding Condition is obeyed.

Given COs = {agent, patient, activity, time, location, environment, background, precedence, etc.}, the Restrictive Contextual Binding Condition can also be defined.

Before we define this condition, let us see how McCarthy (1993) handles the time component. In discussing the relations among contexts, McCarthy (1993) examines the specialize-time (t, c), which he refers to as “a context related to c in which the time is specialized to have the value t”. The axiom that he comes up with is as follows:

    C0: specialize-time (t, c1, c2) ∧ ist (p, c1) ⊃ ist (c2, at-time (t, p))

This axiom relates two assertions. The first is that, in Context1, i.e. c1, the proposition p is true. The second is that, in Context2, i.e. c2, which is a subset of c1 and has the specialize-time t, the proposition is true at time t. The second assertion is a subset of the first assertion. From this perspective, the time component further narrows down the interpretation of the entity with the time aspect.

Let us take a look at another linguistic example demonstrated in the English sentences in (2) below.

    (2) Yesterday John bought a book_i at the bookstore. He enjoyed reading it_i.

Here, the pronoun “it” in the second sentence refers back to the noun phrase “a book”, sitting in the object position of the first sentence and serving as the patient of the action. The temporal adverbial phrase “yesterday” refers to the time component of the context. The locality adverbial phrase “at the bookstore” refers to the locality component of the context. Comparing the noun phrase “a book” in (1) with the noun phrase “a book” in (2), one may notice that the former refers to a general book while the latter refers to a specific book, i.e. the book bought yesterday at the bookstore. In this sense, the latter in (2) may be considered as a subset of the former in (1).

Following McCarthy (1993), the assertion within a specialized time, location, environment, and/or background is treated as a subset of a general assertion. This can be captured schematically as follows in defining the Restrictive Contextual Binding Condition.

The Restrictive Contextual Binding Condition:
Assume X is an entity, and CO is a contextual operator.
    (ii) In a specialized time, location, environment, background, if X[Y, Z] is directly related to CO[Time, Locality] in such a setting:
            CO_i[Time_j, Locality_k] {……X_i[Y_j, Z_k]……}
        then X_i[Y_j, Z_k] is contextually bound by CO_i[Time_j, Locality_k].
The entity X_i[Y_j, Z_k] is directly related to CO_i[Time_j, Locality_k] iff (if and only if) CO_i[Time_j, Locality_k] provides a context that the interpretation of X_i[Y_j, Z_k] solely depends on.

Obviously, CO_i[Time_j, Locality_k] is more restrictive than CO_i. In this sense, the Restrictive Contextual Binding Condition is a subset of the Basic Contextual Binding Condition.

Now, the contextual relationship in (2) can be captured in the following schematic configuration:

    CO_book[Time_yesterday, Locality_atbookstore] {……X_book[Y_yesterday, Z_atbookstore]……}

Applying the Restrictive Contextual Binding Condition, the pronoun “it” is contextually bound by the contextual operator “CO_book[Time_yesterday, Locality_atbookstore]”. This means that the pronoun “it” has to be interpreted as “the book bought yesterday at the bookstore” if the Restrictive Contextual Binding Condition is obeyed.

The Detection Condition

The Detection Condition can be derived from the above two conditions:

    (iii) If X is in such a contextual binding configuration:
            CO_i {……X_m……}
        where X_m is supposed to be contextually bound by CO_i but is not, then an abnormal circumstance is detected.

Likewise,

    (iv) If X is in such a contextual binding configuration:
            CO_i[Time_j, Locality_k] {……X_m[Y_j, Z_k]……}
        where X_m[Y_j, Z_k] is supposed to be contextually bound by CO_i[Time_j, Locality_k] but is not, then an abnormal circumstance is detected.

As shown above, a contextual operator helps to form the contextual binding relationship and to resolve ambiguity. A deception can be detected if the entity is supposed to be contextually bound by a contextual operator but is not so bound in a configuration.

Let us apply these conditions to the case in (1). In (1), there is such a configuration:

    CO_what {……X_what……}

This can be rewritten as follows:

    CO_book {……X_book……}

Here, the pronoun “it” possesses the property “X_book”, which is contextually bound by CO_book. Hence, this interpretation is valid and acceptable.

However, if the pronoun “it” in (1) refers to another entity, say “the map”, rather than the entity “a book” that is mentioned earlier in the sentence, this contextual binding relationship immediately ceases to exist. Below is the configuration:

    CO_book {……X_map……}

Here, the pronoun “it”, which possesses the property “X_map”, is supposed to be contextually bound by the contextual operator “CO_map” and not by the contextual operator “CO_book”. However, the contextual operator “CO_map” is not available. Hence, such a configuration triggers the Detection Condition. The interpretation is thus regarded as being invalid and unacceptable.

Let us look at another linguistic example demonstrated in the English sentence in (3) below.

    (3) * John likes to buy a book_i and read them_i within three days.

Any speaker of English knows that this sentence is awkward in the context where the pronoun “them” refers back to the noun phrase “a book”, as there is a mismatch between the third-person singular form and the third-person plural form.

The contextual relationship can be captured in the following schematic configuration:

    CO_book {……X_books……}

Here, X_books is supposed to be contextually bound by CO_book but is not. Hence, an abnormal circumstance is detected.

Let us have a look at still another linguistic example demonstrated in the English sentence in (4) below.

    (4) * John likes to buy a cookbook_i and cook it_i following the instruction.

Any speaker of English knows that it is awkward to have the pronoun “it” in this context refer back to the noun phrase “a cookbook”, because the noun phrase “a cookbook” possesses the features [+Object, -edible] while the pronoun “it” possesses the features [+Object, +edible] in the sub-context of “cooking”. This mismatch in features indicates that the noun phrase “a cookbook” and the pronoun “it” refer to different entities. In other words, the pronoun “it” is not contextually bound by the contextual operator “a cookbook” in this particular case.

The contextual relationship can be captured in the following schematic configuration:

    CO_cookbook {……X_ediblething……}

As the pronoun “it” is not contextually bound by CO_cookbook in its contextual domain, another abnormal circumstance is detected.

Assuming whatever is within the contextual operator is normal, the variable is normal if and only if it is contextually bound by its corresponding contextual operator. As shown above, in order to be properly bound in its contextual domain, the variable has to possess the same features or properties as those of the contextual operator. Any deviation triggers the Detection Condition.

Deception Detection

In this section, the Contextual Binding Conditions and the Detection Condition are applied in the detection of deception. Assume that what an application or an executable is expected to do on the basis of its functional requirement is included in the feature set of the contextual operator. As a result, this sets up the baseline for the application or the executable. The actual execution of the application or the executable is a variable, which should be bound by the contextual operator. If the actual execution involves more features than or different features from what is contained in the contextual operator, the deviation from being normal is identified, the Detection Condition is triggered, and a deception is detected.

Let us examine fabrication first. A piece of malware is a good example of fabrication. For instance, appended to the executable “notepad.exe” is a piece of code that makes it possible for the executable to perform file transfer in addition to its original functionality of text file editing. This abnormal behavior can be easily detected with the help of the Basic Contextual Binding Condition and the Detection Condition.

Assume what is expected for the original functionality of the executable is contained inside a contextual operator as a feature set. Assume the actual functionality of the executable is contained within a variable as current features. The variable, by definition, should be contextually bound by the contextual operator. Schematically, this relationship is represented below:

    CO_TextEditing {……X_TextEditing……}

This represents a normal situation, in which an executable is doing what it is expected to do.

When an extra functionality is added into this executable, the contextual relationship changes, as shown below:

    CO_TextEditing {……X_TextEditing+FileTransfer……}

Here, one of the actual functionalities of the executable, i.e. “FileTransfer”, is not contextually bound by the contextual operator. Thus, an unacceptable behavior is immediately detected at the application level, even before it is executed and at the time when a request for extra resource utilization is made.

This also applies to other pieces of malware, which always make requests for additional resource utilization. If this contextual analysis component is implemented within the kernel of an operating system, then whenever a request for resource utilization is received that is not contextually bound by a contextual operator, the request is denied immediately, an investigation is launched, and the activity is logged.

Let us check manipulation now. Steganography is a good example of manipulation. For instance, one may hide a text file inside a graphic file. After this operation, the file size of the modified graphic file may remain the same as the file size of the original graphic file. At first glance, nothing seems to have happened. However, using a digital forensic tool, one would see the systematic change of hexadecimal code even though the change for each byte is minor, say the change from “0x52” to “0x51” in one byte and the change from “0x73” to “0x72” in another byte that is 3 bytes after the previously changed byte. This becomes obvious when one compares the code for the original graphic file with the code for the modified graphic file. In addition, the original file timestamp pattern, consisting of the date created time, the date accessed time, the date modified time, and the date last saved time, is changed. Evidently, the Restrictive Contextual Binding Condition is violated. Hence, the abnormal behavior in this type of case can also be detected.

Assume both the expected code pattern and the expected timestamp pattern are contained within the feature set of the contextual operator. Assume the actual code pattern and the actual timestamp pattern are contained as current features in the variable. Also assume that the timestamps are used to further restrict the actual code pattern, as illustrated in the Restrictive Contextual Binding Condition. By definition, the variable should be contextually bound by the contextual operator. Schematically, this relationship is represented below:

    CO_Pattern1[Time_Pattern2] {……X_Pattern1[Y_Pattern2]……}

This represents a normal situation, in which an expected pattern is obtained.

When a graphic file is altered to accommodate a hidden text file, the actual code pattern changes. Now, the contextual relationship also changes, as shown below:

    CO_Pattern1[Time_Pattern2] {……X_Pattern6[Y_Pattern7]……}

Here, the actual representation of the file is not contextually bound by the contextual operator, because the actual code pattern, “X_Pattern6”, differs from the expected code pattern “Pattern1” contained in the contextual operator, and the actual timestamp pattern, “Y_Pattern7”, differs from the expected timestamp pattern “Pattern2” contained in the contextual operator. Hence, the unacceptable behavior is detected at the code level.

As shown above, the Contextual Binding Conditions and the Detection Condition can successfully detect deception such as fabrication and manipulation.

Conclusion

Detecting deception in cyberspace is a challenge. Based on the analysis of the unique property of contextual operators in a natural language, this paper proposes a contextual binding mechanism that can be used to disambiguate interpretation and identify invalid and unacceptable interpretation in a natural language. The same mechanism can also be used to detect deception in the cyber domain, specifically fabrication and manipulation. This mechanism can not only aid decision-making in cyber conflicts or cyber competitions but also lay the foundation for employing contextual operators in an artificial intelligence system.

References

Almeshekah, M. and Spafford, E. 2014. Planning and Integrating Deception into Security Defenses. The New Security Paradigm Workshop (NSPW 2014). Retrieved from http://www.meshekah.com/wp-content/uploads/2014/10/planning_and_integrating_deception_into_computer_security_defenses_Almeshekah_Spafford.pdf.

Brézillon, P. 1999. Context in problem solving: A survey. The Knowledge Engineering Review, 14 (1), pp.1-34.

Brézillon, P. 2002. Modeling and using context: Past, present, and future. Rapport de Recherche du LIP6 2002/010, Université Paris 6, France.

Brézillon, P. 2003. Context Dynamic and Explanation in Contextual Graphs. Context 2003. Lecture Notes in Artificial Intelligence (LNAI) 2680. Blackburn, P. et al. (Eds). pp.94-106. Berlin: Springer-Verlag.

Burgoon, J., Blair, J., Qin, T., and Nunamaker, J. 2003. Detecting Deception through Linguistic Analysis. Proceedings of First NSF/NIJ Symposium on Intelligence and Security Informatics. pp.91-101. Berlin: Springer-Verlag.

Caddell, J. 2004. Deception 101 - Primer on Deception. Strategic Studies Institute, U.S. Army War College.

Chen, J. and Duvall, G. 2014. On Operational-Level Cybersecurity Strategy Formation. Journal of Information Warfare, 13 (3), pp.79-87.

Gaifman, H. 2008. Contextual Logic with Modalities for Time and Space. Review of Symbolic Logic, 1 (4), pp.433-458.

Grossi, D., Dignum, F., and Meyer, J. 2006. Contextual Terminologies. Computational Logic in Multi-Agent Systems (CLIMA) VI, Lecture Notes in Artificial Intelligence (LNAI) 3900. Toni, F. and Torroni, P. (Eds). pp.284-302. Berlin: Springer-Verlag.

Kay, P. 1989. Contextual Operators: Respective, Respectively, and Vice Versa. Proceedings of the Fifteenth Annual Meeting of the Berkeley Linguistics Society. pp.181-192.

McCarthy, J. 1993. Notes on Formalizing Context. Proceedings of the 13th International Joint Conference on Artificial Intelligence, Volume 1. pp.555-560. San Francisco, CA: Morgan Kaufmann Publishers Inc.

Rebuschi, M. and Lihoreau, F. 2009. Contextual Epistemic Logic. Retrieved from http://www.academia.edu/8052225/Contextual_Epistemic_Logic.

Zhou, L., Shi, Y., and Zhang, D. 2008. A Statistical Language Modeling Approach to Online Deception Detection. IEEE Transactions on Knowledge and Data Engineering, 20 (8). pp.1077-1081.
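
The Contextual Binding Conditions (i) and (ii) and the Detection Condition (iii) and (iv) lend themselves to a small illustrative sketch. The Python code below is only one possible reading, under stated assumptions, and is not an implementation from this paper: features are modeled as sets of strings, the bracketed restriction (e.g. Time, Locality) as a tuple, and all names (ContextualOperator, Entity, unbound_features, is_bound, detect) are hypothetical. Following the wording "more features than or different features from", an entity with fewer features than the operator is treated as still bound; that choice is also an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ContextualOperator:
    """CO: expected (baseline) features, optionally restricted as CO[Time, Locality]."""
    features: FrozenSet[str]
    restriction: Optional[Tuple[str, ...]] = None


@dataclass(frozen=True)
class Entity:
    """X: features actually observed, optionally carrying its own X[Y, Z] values."""
    features: FrozenSet[str]
    restriction: Optional[Tuple[str, ...]] = None


def unbound_features(co: ContextualOperator, x: Entity) -> FrozenSet[str]:
    """Features of X that CO does not license (extra or different features)."""
    return x.features - co.features


def is_bound(co: ContextualOperator, x: Entity) -> bool:
    """Conditions (i) and (ii): X is bound when CO licenses all of its features
    and, if CO is restricted, X carries the same restriction values."""
    if unbound_features(co, x):
        return False
    if co.restriction is not None and x.restriction != co.restriction:
        return False
    return True


def detect(co: ContextualOperator, x: Entity) -> bool:
    """Conditions (iii) and (iv): an abnormal circumstance is flagged when X is
    supposed to be bound by CO but is not."""
    return not is_bound(co, x)


# Fabrication: a text editor that has gained a file-transfer capability.
text_editing = ContextualOperator(frozenset({"TextEditing"}))
clean = Entity(frozenset({"TextEditing"}))
tampered = Entity(frozenset({"TextEditing", "FileTransfer"}))
print(detect(text_editing, clean))                       # False
print(detect(text_editing, tampered))                    # True
print(sorted(unbound_features(text_editing, tampered)))  # ['FileTransfer']

# Manipulation: code pattern and timestamp pattern both deviate from the baseline.
expected_file = ContextualOperator(frozenset({"Pattern1"}), ("Pattern2",))
altered_file = Entity(frozenset({"Pattern6"}), ("Pattern7",))
print(detect(expected_file, altered_file))               # True
```

In this reading, the feature set of the contextual operator plays the role of the baseline, and detection reduces to a set difference plus an equality check on the restriction. A deployed system would need a principled way to derive the feature sets, which the sketch does not address.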