Intelligence Analysis Ontology for Cognitive Assistants

Mihai Boicu, Gheorghe Tecuci and David Schum

Abstract—This paper presents results on developing a general intelligence analysis ontology which is part of the knowledge base of Disciple-LTA, a unique and complex cognitive assistant for evidence-based hypothesis analysis that helps an intelligence analyst cope with many of the complexities of intelligence analysis. It introduces the cognitive assistant and overviews the various roles and the main components of the ontology: an ontology of “substance-blind” classes of items of evidence, an ontology of believability analysis credentials, and an ontology of actions involved in the chains of custody of the items of evidence.

Index Terms—cognitive assistant, ontology, evidence-based hypothesis analysis, types of items of evidence, chains of custody

Manuscript received October 31, 2008. This work was supported in part by several U.S. Government organizations, including the Air Force Office of Scientific Research (FA9550-07-1-0268), the Air Force Research Laboratory (FA8750-04-1-0257), and the National Science Foundation (0750461). The US Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research, the Air Force Research Laboratory, the National Science Foundation or the U.S. Government.

Dr. Mihai Boicu is Assistant Professor of Applied Information Technology and Associate Director of the Learning Agents Center in the Volgenau School of Information Technology and Engineering, George Mason University, 4400 University Dr., Fairfax VA 22030, USA (phone: 703-993-1591; fax: 703-993-9275; e-mail: mboicu@gmu.edu).

Dr. Gheorghe Tecuci is Professor of Computer Science in the Volgenau School of Information Technology and Engineering and Director of the Learning Agents Center at George Mason University, Fairfax VA 22030, USA, and Visiting Professor and former Chair of Artificial Intelligence at the US Army War College (e-mail: tecuci@gmu.edu).

Dr. David Schum holds the rank of Professor in the Systems Engineering and Operations Research Department in the Volgenau School of Information Technology and Engineering, and in the School of Law, at George Mason University, Fairfax VA 22030, USA. He is also Honorary Professor of Evidence Science, University College London, UK (e-mail: dschum@gmu.edu).

I. THE COMPLEXITY OF INTELLIGENCE ANALYSIS

Intelligence analysts face the difficult task of analyzing masses of information of different forms and from a variety of sources. Arguments, often stunningly complex, are necessary in order to link evidence to the hypotheses being considered. These arguments have to establish the three major credentials of evidence: its relevance, credibility, and inferential force or weight. Relevance considerations answer the question: So what? How does this item of information bear on any hypothesis being considered? Credibility considerations answer the question: Can we believe what this item of information is telling us? Inferential force or weight considerations answer the question: How strongly does this item of evidence favor or disfavor the alternative hypotheses we are considering?

Establishing these three evidence credentials always involves mixtures of imaginative and critical reasoning. Indeed, as work on an analytic problem proceeds, we commonly have evidence in search of hypotheses at the same time as hypotheses in search of evidence. First, various hypotheses and lines of inquiry must be generated by analysts who imagine possible explanations for the continuous occurrence of events in our non-stationary world. Second, considerable imagination is required in decisions about what items of information should be considered in the analytic problem at hand. But critical reasoning in intelligence analysis is equally important. No item of evidence comes with its relevance, credibility, and inferential force or weight credentials already established. These credentials must be established by defensible and persuasive arguments which have to take into account that our evidence is always incomplete, usually inconclusive, frequently ambiguous, commonly dissonant, and that it comes to us from sources having any gradation of credibility shy of perfection [1].

But the inherent complexity of the analysts' tasks is only part of their problems. In many cases, analysts are not given unlimited time to generate hypotheses and evidence and to construct elaborate and careful arguments on all elements of the analysis at hand. One way of describing this problem is to say that analysts will have neither the time nor the necessary evidential basis for drilling down or decomposing all elements of the problem being considered. In many instances, analysts are faced with the necessity of having to make various assumptions in which certain events are believed "as if" they actually occurred. And always, the world is evolving and yesterday’s analysis needs to be updated with new items of evidence discovered today.

II. DISCIPLE-LTA: ANALYST’S COGNITIVE ASSISTANT

Disciple-LTA is a unique and complex analytic tool that can help an intelligence analyst cope with many of the complexities of intelligence analysis [2], [3]. The name Disciple, by itself, suggests that it learns about intelligence analysis through its interaction with experienced intelligence analysts. The word "disciple" has synonyms including: learner, advocate, supporter, and proponent. The addition "LTA" refers to the fact that Disciple learns analysis [L], that it can serve as a tutor [T] for novice and experienced analysts, and that it can assist [A] in the performance of analytic tasks, e.g. in current or in finished intelligence analyses.

Disciple-LTA has two very distinct differences from other knowledge-based or rule-based "expert systems" developed in the field of artificial intelligence over the years. Such systems are developed by knowledge engineers who attempt to capture and represent the heuristics or rules of the experienced expert users so that they could be preserved and utilized in new situations. This is a very long and difficult process that results in systems that are even more difficult to maintain. But Disciple-LTA is qualitatively different from these earlier expert systems. Instead of being programmed by a knowledge engineer, Disciple-LTA learns its expertise directly from expert analysts who can teach it in a way that is similar to how they would teach a person. However, when it is first used by an expert analyst, Disciple-LTA does not engage in this interaction with a blank mental tablet. Disciple-LTA already has a stock of established knowledge about evidence, its properties, uses, and discovery. Some of this knowledge may not already be resident in the minds of its expert users, who apply their experience with certain analytic contexts that Disciple will learn. So, Disciple does learn about specific intelligence problems from its users, but it can combine this knowledge with what it already knows about various elements of evidential reasoning. Conventional expert systems can be no better than the expertise of the persons whose heuristics are trapped; this represents a "ceiling" on the suitability of these earlier systems. But this ceiling is actually the "floor" for Disciple-LTA, since this system incorporates basic knowledge of the evidential reasoning tasks analysts face in addition to the substantive expertise of the analysts who interact with it.

One basic feature of Disciple-LTA is that it provides the analyst the opportunity to decompose a complex problem into finer levels; i.e. it rests upon a "divide and conquer" strategy for dealing with the analytic complexity of hypotheses in search of evidence. In particular, it allows "top-down" decompositions to deduce from a stated hypothesis what needs to be proven in order to sustain this hypothesis. This decomposition eventually results in the identification of possible sources of evidence relevant to this hypothesis. Consider, for example, the problem of assessing whether Al Qaeda has nuclear weapons. This problem can be reduced to three simpler problems of assessing whether Al Qaeda has reasons, has desires, and has the ability to obtain nuclear weapons. Each of these simpler problems is further reduced to even simpler ones (e.g. by considering specific reasons, such as deterrence, self-defense, or spectacular operation) that could be solved either based on the available knowledge or by analyzing relevant items of evidence. An abstraction of these decompositions is presented in the left-hand side of Fig. 1.

Fig. 1. Hypothesis analysis through problem reduction and solution synthesis.

Let us consider “Spectacular operation as reason”, which is a short name for “Assess whether Al Qaeda considers the use of nuclear weapons in spectacular operations as a reason to obtain nuclear weapons.” As indicated in the left-hand side of Fig. 1, to solve this hypothesis analysis problem Disciple-LTA considered both favoring evidence and disfavoring evidence. Disciple-LTA has found two items of favoring evidence, EVD-FP-Glazov01-01c and EVD-WP-Allison01-01, and it has analyzed to what extent each of them favors the hypothesis that Al Qaeda considers the use of nuclear weapons in spectacular operations as a reason to obtain nuclear weapons. EVD-FP-Glazov01-01c is shown in the bottom right of Fig. 1. It is a fragment from a magazine article published in the Front Page Magazine by Glazov J., where he cites Treverton G., who stated that Al Qaeda may perform a spectacular nuclear attack against the United States [4]. To analyze EVD-FP-Glazov01-01c, Disciple-LTA considered both its relevance and its believability [1], [5]. The believability of EVD-FP-Glazov01-01c depends both on the believability of Glazov J. (the reporter of this piece of information) and the believability of Treverton G. (the source). The believability of the source depends on his competence and his credibility. The credibility of Treverton G. depends on his veracity, objectivity, and analytical ability.

When the analyst clicks on a problem, such as “Credibility” from the left-hand side of Fig. 1, Disciple-LTA displays the details on how it solved that problem, as shown in the right-hand side of Fig. 1. For example, to “Assess the credibility of Treverton G as the source of EVD-FP-Glazov01-01c” Disciple-LTA has assessed his veracity, objectivity, and analytical ability. Then the results of these assessments (almost certain, almost certain and almost certain) have been combined into an assessment of the credibility (almost certain). Disciple-LTA may use different synthesis functions for the solutions (such as minimum, maximum, average, etc.), depending on the types of the problems. An abstraction of the synthesis process is displayed in the left-hand side of Fig. 1, where the solutions appear in green, attached to the corresponding problems. Notice that this problem-reduction/solution-synthesis approach enables a natural integration of logic and probability.

In some situations the analysts will not have the time to deal with all of the complexities their own experience and Disciple-LTA make evident. In other situations, analysts will not have access to the kinds of information necessary to answer all questions regarding elements of an analysis that seem necessary. In such situations Disciple-LTA allows the user to decompose (“to drill down”) an analysis to different levels of refinement in order to reach conclusions about necessary analytic ingredients, by providing mechanisms necessary to identify assumptions that are being made and by showing the extent to which conclusions rest upon these assumptions [3].

For evidence in search of hypotheses, Disciple-LTA allows the construction of "bottom-up" structures in which possible alternative hypotheses are generated. No computer system, even Disciple-LTA, is capable of the imaginative thought required to generate hypotheses and new lines of inquiry. But Disciple-LTA can assist in this process by prompting the analyst to consider the inferential consequences of chains of thought that occur in the process of generating hypotheses and new lines of inquiry and evidence.

The following sections will discuss the general features of the intelligence analysis ontology of Disciple-LTA.

III. KNOWLEDGE BASE STRUCTURE FOR SHARING AND REUSE

In addition to the separation of knowledge and control (which is a characteristic of all knowledge-based systems), Disciple-LTA is characterized by an additional architectural separation at the level of the knowledge base. Its knowledge base is structured into an object ontology that defines the concepts of the application domain, and a set of problem solving rules expressed in terms of these concepts. While an ontology is characteristic of an entire domain (such as intelligence analysis), the rules are much more specific, corresponding to a certain type of application in that domain, and even to specific subject matter experts. This separation allows one to easily share and reuse the ontology developed for a given intelligence analysis application when developing a new one. Additionally, the ontology in Disciple-LTA is organized as a distributed hierarchy of several ontologies, which further facilitates its sharing and reuse, as well as its development and maintenance.

IV. MULTIPLE ROLES FOR ONTOLOGY

The object ontology plays a crucial role in Disciple-LTA and in cognitive assistants in general, being at the basis of knowledge representation, user-agent communication, problem solving, knowledge acquisition and learning [6].

First, the object ontology provides the basic representational constituents for all the elements of the knowledge base, including the problems, the problem reduction rules, and the solution synthesis rules. The ontology language of Disciple-LTA is an extension of OWL Lite [7] that allows the representation of partially learned concepts and features. A partially learned feature may have both its domain and its range represented as plausible version space concepts [6]. One may also define different symbolic probability scales, such as Kent, DNI, IPCC or legal [8], and automatically convert from one to another and into Bayesian probabilities. For example, the left-hand side of Fig. 2 shows the symbolic probabilities for likelihood, based on the DNI’s standard estimative language, while the right-hand side shows the corresponding Bayesian probability intervals. The ontology also allows the representation of items of evidence that may contain different or even contradictory views on some entities.

Fig. 2. Symbolic probabilities for likelihood.

Second, the agent’s ontology enables the agent to communicate with the user and with other agents by declaring the terms that the agent understands. As illustrated in the upper-right part of Fig. 1, the agent uses natural language phrases where the terms from the ontology appear in blue. Consequently, the ontology enables knowledge sharing and reuse among agents that share a common vocabulary which they understand.

Third, the problem solving rules of the agent are applied by matching them against the current state of the agent’s world, which is represented in the ontology. The use of partially learned knowledge (with plausible version spaces) in reasoning allows problems to be solved with different degrees of confidence [2].

Fourth, the object ontology represents the generalization hierarchy for learning, general rules being learned from specific problem solving examples by traversing this hierarchy [2], [3], [6].

V. ONTOLOGY OF “SUBSTANCE-BLIND” CLASSES OF ITEMS OF EVIDENCE

Being able to categorize evidence is vitally necessary for many reasons, one of the most important being that we must ask different questions of and about our evidence in the process of intelligence analysis, in which we encounter different recurrent forms and combinations of evidence. If we were not able to categorize evidence in useful ways we might not be aware of many different questions we should be asking of our evidence. However, asked to say how many kinds of evidence there are, we could easily say that there is a nearly infinite number, if we considered its substance or content. This presents a significant problem: how can we ever say anything general about evidence if every item of it is different from every other item? Fortunately there is a "substance-blind" way of categorizing evidence that does not rely at all on its substance or content, but on its inferential properties: its relevance and believability.

Disciple-LTA includes an ontology of “substance-blind” classes of items of evidence. Some of the classes based on their believability attributes are shown in Fig. 3 [1].

Fig. 3. “Substance-blind” classes of items of evidence.

If you can pick up the evidence yourself and examine it to see what events it might reveal, we say the evidence is tangible in nature, such as objects, documents, images, and tables of measurements. We distinguish between real tangible evidence, which is an actual thing itself (such as a captured weapon component), and demonstrative tangible evidence, which is a representation or illustration of this thing (such as a diagram of that component). Now suppose you have nothing you can examine for yourself and must rely on someone else who has made some observation and who will tell you about the occurrence or nonoccurrence of some event. This is called testimonial evidence, as in a HUMINT report from an asset. This person may state unequivocally that some event has occurred or has not occurred. Of great concern is how the person providing testimonial evidence obtained the information reported. Did this person make a direct observation, or did he/she learn about the occurrence or nonoccurrence of the reported event from another person, in which case we have secondhand or hearsay evidence? Moreover, there are classes of evidence mixtures, such as testimonial evidence about tangible evidence. It would not be uncommon in intelligence analysis to encounter evidence obtained through a chain of sources (see section VII).

VI. ONTOLOGY OF BELIEVABILITY ANALYSIS CREDENTIALS

As discussed above, the “substance-blind” ontology of classes of evidence is based on their believability and relevance credentials. That is, there are specific credentials for each such class. For example, the believability of a source of direct testimonial evidence depends on the source’s competence and credibility [1], [5].

Assessments of the competence of a source require answers to two important questions. First, did this source have access to, or did it actually observe, the events being reported? If it is believed that a source did not have access to, or did not actually observe, the events being reported, we have very strong grounds for suspecting that this source fabricated this report or was instructed what to tell us. Second, we must have assurance that the source understood the events being observed well enough to provide us with an intelligible account of these events. So, access and understanding are the two major attributes of a human source's competence.

Assessments of human source credibility require consideration of entirely different attributes: veracity (or truthfulness), objectivity, and observational sensitivity under the conditions of observation. Here is an account of why these are the major attributes of testimonial credibility. First, is this source telling us about an event he/she believes to have occurred? This source would be untruthful if he/she did not believe the reported event actually occurred. So, this question involves the source's veracity. The second question involves the source's objectivity. The question is: did this source base a belief on sensory evidence received during an observation, or did this source believe the reported event occurred because this source either expected or wished it to occur? An objective observer is one who bases a belief on sensory evidence rather than on desires or expectations. Finally, if the source did base a belief on sensory evidence, how good was this evidence? This involves information about the source's relevant sensory capabilities and the conditions under which a relevant observation was made.

Answers to these competence and credibility questions require information about our human sources. But one thing is abundantly clear: the competence and credibility of HUMINT sources are entirely distinct. Competence does not entail credibility, nor does credibility entail competence. Confusing these two characteristics invites inferential disaster. Disciple-LTA includes an ontology of these credentials and Fig. 1 shows an example of using such credentials in analyzing the believability of an item of evidence.

VII. ONTOLOGY OF ACTIONS FROM CHAINS OF CUSTODY

A crucial step in answering questions on the believability of the items of evidence involves having knowledge about the chain of custody through which the testimonial or tangible item has passed en route to the analyst who is charged with assessing it. Basically, establishing a chain of custody involves identifying the persons and devices involved in the acquisition, processing, examination, interpretation, and transfer of evidence between the time the evidence is acquired and the time it is provided to intelligence analysts. Many things may have been done to evidence in a chain of custody that may have altered the original item of evidence, or have provided an inaccurate or incomplete account of it. In some cases original evidence may have been tampered with in various ways, so that analysts risk drawing quite erroneous conclusions from the evidence they receive.

Suppose we have an analyst who is provided with an item of testimonial evidence by an informant who speaks only in a foreign language. We assume that this informant's original testimony is first recorded by one of our intelligence professionals; it is then translated into English by a paid translator. This translation is then edited by another intelligence professional; and then the edited version of this translation is transmitted to an intelligence analyst. So, there are four links in this conjectural chain of custody of this original testimonial item: recording, translation, editing, and transmission. Various things can happen at each one of these links that can prevent the analyst from having an authentic account of what our source originally provided. Fig. 4 shows how the believability of the testimonial evidence provided to the analyst (EVD-Wallflower-5) depends on the believability of the testimony of the informant (i.e. EVD-Wallflower-1), but also on the believability of the Recording, Translation, Editing, and Transmission actions.

Fig. 4. Evaluating the believability of an item of evidence obtained through a chain of custody.

Disciple-LTA has an ontology of actions that may be involved in a wide variety of chains of custody for different types of evidence, such as HUMINT, IMINT, SIGINT or TECHINT. For example, Fig. 5 shows the representation of a translation action. The believability of this translation depends both on the translator’s competence (in the two languages, as well as the subject matter being translated) and on his/her credibility.

Fig. 5. Action involved in a chain of custody for an item of evidence.

VIII. LESSONS AND STORIES ABOUT INTELLIGENCE ANALYSIS CONCEPTS

Disciple-LTA can be used to help new intelligence analysts learn the reasoning processes involved in making intelligence judgments and solving intelligence analysis problems. In particular, its ontology includes lessons and stories about a wide range of intelligence analysis concepts, such as the lesson on veracity illustrated in Fig. 6 [5]. Moreover, its stock of established knowledge about evidence, its properties, uses, and discovery makes it a suitable educational tool even for expert analysts.

Fig. 6. Fragment from the lesson on veracity.

REFERENCES

[1] Schum D.A., The Evidential Foundations of Probabilistic Reasoning, Northwestern University Press, 2001.
[2] Tecuci G., Boicu M., Marcu D., Boicu C., Barbulescu M., Ayers C., Cammons D., Cognitive Assistants for Analysts, Journal of Intelligence Community Research and Development, 2007.
[3] Tecuci G., Boicu M., Marcu D., Boicu C., Barbulescu M., Disciple-LTA: Learning, Tutoring and Analytic Assistance, Journal of Intelligence Community Research and Development, 2008.
[4] Glazov J., Symposium: Diagnosing Al Qaeda, FrontPageMagazine.com, 18 August 2003.
[5] Schum D.A., Lessons and Stories about Concepts Encountered in Disciple-LTA, Research Report 2, Learning Agents Center, 2007.
[6] Tecuci G., Boicu M., A Guide for Ontology Development with Disciple, Research Report 3, Learning Agents Center, 2008.
[7] W3C, W3C Recommendation, OWL Web Ontology Language Overview, http://www.w3.org/TR/owl-features/, February 2004.
[8] Weiss C., Communicating Uncertainty in Intelligence and Other Professions, International Journal of Intelligence and CounterIntelligence, 21:1, pp 57-85, 2008.
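
The problem-reduction/solution-synthesis scheme of Section II can be illustrated with a minimal Python sketch. It is not Disciple-LTA code: the `Problem` class, the `solve` function, and the ordered `SCALE` are hypothetical names introduced here, and the scale is only an assumed ordering of symbolic likelihoods, not the one shown in Fig. 2. The sketch reduces a problem to subproblems and combines their symbolic solutions with a selectable synthesis function (minimum or maximum, two of the functions the paper mentions).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical ordered scale, weakest to strongest (an assumption for
# illustration; it is not the scale shown in Fig. 2).
SCALE = ["remote", "very unlikely", "unlikely", "even chance",
         "likely", "very likely", "almost certain"]


def minimum(values: List[str]) -> str:
    """Synthesis function: keep the weakest assessment."""
    return min(values, key=SCALE.index)


def maximum(values: List[str]) -> str:
    """Synthesis function: keep the strongest assessment."""
    return max(values, key=SCALE.index)


@dataclass
class Problem:
    """A node in a reduction tree (hypothetical structure)."""
    name: str
    solution: Optional[str] = None                    # given for leaf problems
    subproblems: List["Problem"] = field(default_factory=list)
    synthesize: Callable[[List[str]], str] = minimum  # assumed default


def solve(problem: Problem) -> str:
    """Solve the subproblems first, then synthesize their solutions."""
    if not problem.subproblems:
        if problem.solution is None:
            raise ValueError(f"no assessment for leaf problem: {problem.name}")
        return problem.solution
    problem.solution = problem.synthesize(
        [solve(sub) for sub in problem.subproblems])
    return problem.solution


# The credibility example from Section II, with the assessments it reports.
credibility = Problem(
    "Assess the credibility of Treverton G.",
    subproblems=[
        Problem("veracity", "almost certain"),
        Problem("objectivity", "almost certain"),
        Problem("analytical ability", "almost certain"),
    ],
)
print(solve(credibility))  # almost certain
```

Keeping the synthesis function as a field of each problem reflects the paper's remark that different functions may be used depending on the type of problem; which function applies to which problem type is not specified in the paper.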
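
The chain-of-custody dependency of Section VII can be sketched in the same spirit. The paper states that the believability of the delivered item depends on the informant's testimony and on each custody action, and that a translation's believability depends on the translator's competence and credibility, but it does not say how these are combined. The sketch below assumes a "weakest link" (minimum) rule and assumes every action has the same two attributes; `Likelihood`, `CustodyAction`, `chain_believability`, and all assessment values are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Likelihood(IntEnum):
    """Hypothetical ordinal scale (not the scale of Fig. 2)."""
    REMOTE = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    EVEN_CHANCE = 3
    LIKELY = 4
    VERY_LIKELY = 5
    ALMOST_CERTAIN = 6


@dataclass
class CustodyAction:
    """One link in a chain of custody, e.g. a translation."""
    kind: str
    competence: Likelihood   # e.g. translator's skill in both languages
    credibility: Likelihood

    def believability(self) -> Likelihood:
        # Assumption: an action is only as believable as its weaker attribute.
        return min(self.competence, self.credibility)


def chain_believability(source: Likelihood,
                        actions: List[CustodyAction]) -> Likelihood:
    """Assumed weakest-link rule over the source and every custody action."""
    return min([source] + [action.believability() for action in actions])


# The four links from Section VII, with made-up illustrative assessments.
chain = [
    CustodyAction("recording", Likelihood.ALMOST_CERTAIN, Likelihood.ALMOST_CERTAIN),
    CustodyAction("translation", Likelihood.LIKELY, Likelihood.VERY_LIKELY),
    CustodyAction("editing", Likelihood.VERY_LIKELY, Likelihood.ALMOST_CERTAIN),
    CustodyAction("transmission", Likelihood.ALMOST_CERTAIN, Likelihood.ALMOST_CERTAIN),
]
result = chain_believability(Likelihood.VERY_LIKELY, chain)
print(result.name)  # LIKELY
```

Under this assumed rule a single weak link, here the translator's competence, caps the believability of the item the analyst receives, which matches the paper's point that any link can prevent the analyst from having an authentic account.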