Credibility Models

Charles R. Twardy, Edward J. Wright, Stephen J. Canon & Masami Takikawa*
Information Extraction & Transport, Inc.
1911 N. Fort Myer Dr., Suite 600, Arlington, VA 20190
{ctwardy,ewright,scannon,mtakikawa}@iet.com
Revision: 1.6

* Now with Cleverset, Inc. Dr. Takikawa can also be reached at takikawa@cleverset.com.

Abstract

We present a general hierarchical Bayesian model in which Intelligence Sources make Reports about events or states in the world, which we call Hypotheses. The underlying multi-entity Bayes net for even a simple scenario has hundreds of nodes. We hide the details via Wigmore diagrams and a Google Maps GUI. Our application domain is Intelligence data fusion in asymmetrical warfare (terrorism). Some Hypotheses – like whether a village is a threat – may be abstract or unobservable. For these, we define Indicators – more observable Hypotheses whose value has some bearing on the target Hypothesis. The hierarchy can be arbitrarily deep, and Reports can provide evidence at any level. Furthermore, all Sources have credibility models. Traditional Sources are physical sensors with well-known error models. Non-traditional Sources include humans, websites, news, etc. For these Sources, our credibility models include Hypotheses about unknown factors like objectivity, competence, accuracy, reliability, and veracity. Every Report by a Source provides evidence about those factors. So, for example, successful ad hominem attacks against one Source can undermine his assurances that a village is safe, and lead us to believe it is hostile after all.

1 INTRODUCTION

Our domain is structured evidential reasoning, especially Intelligence analysis. Our task is to reason scientifically about the credibility of Intelligence sources, so that we may properly weigh conflicting evidence for and against various hypotheses. We do that via a complex, hierarchical Bayesian network. However, because Intelligence Analysts do not wish to encounter the full computational complexity of the resulting multi-entity model, we provide simplified views via Wigmore diagrams centered on specific hypotheses. Even so, there are too many of these, so we attach them to specific actors in the world. These actors are presented in a map-based GUI like that shown in Figure 1. In our model, there are three main classes: Hypotheses, Sources, and Reports. When capitalized, these refer to classes or objects in our model.[1]

[1] We capitalize Intelligence to distinguish the government activity from the human attribute. Attributes appear like so: accuracy.

Figure 1: Initial screenshot showing the map window, the Sources dock, and incoming evidence. The hypothetical village of Kafur is shown with the tractor logo, in the upper left. The screenshot has been altered to fit.

We begin with a use case involving a fictional farming and fishing village named Kafur, a few Intelligence sources, and reports from those sources. This morning, electronic Signals Intelligence (SIGINT) reported that Kafur received a suspiciously large shipment of fertilizer. We are concerned that it may be used for explosives. Figure 1 shows our initial state. The belief bar under Kafur shows its perceived threat level – a default 50% for an unknown place. Our Intelligence encyclopedia reports that Kafur's allegiance is 80% Blue – which means favorable to us. We drag that Report from Incoming Evidence onto Kafur's icon.
A dialog box lets us confirm the Source, set the precise Hypothesis to Kafur.allegiance, and set the value to "Blue". Encyclopedia has medium credibility, enough to move Kafur's threat level comfortably into the green.

Now we apply this morning's SIGINT Report from "Sensor 1" onto Kafur. For dramatic effect, we consider it direct evidence that Kafur is a threat, so we apply it to Kafur.threat, with the value True. Because Sensor 1 is a reliable source, Kafur's threat probability moves into the red zone.

We begin talking to Sources. Joe reports that the fertilizer is in fact going to farmers who live outside the village. This drops the threat probability into the yellow zone, but Joe is a relatively new source. We're waiting to hear from our agent Carl.

While we wait, we view the Wigmore[2] diagram showing the lines of evidence for Kafur.threat (see Figure 2). Reports from Joe and SIGINT directly influence Kafur.threat, while the Encyclopedia report contributes via Allegiance.

[2] In a Wigmore diagram, arrows show the direction of inference, from evidence to hypothesis [Wigmore, 1931]. BN arrows are often in the opposite direction.

Figure 2: Wigmore diagram showing lines of evidence for Kafur.threat (top), after Joe's HUMINT report. For clarity, we have added short names in large bold, and italicized the unique IDs.

Different sources, in turn, may have different credibility models. For example, the credibility model for SIGINT looks like Figure 3. The sensor attributes TPrate, FPrate, and Reliability (second row) determine probabilities for the Boolean report attributes TP, FP, and Working, respectively (third row). These in turn influence the report's status (the bottom red circle).

Figure 3: Wigmore diagram for SIGINT credibility, after filing one Report. (Some text rearranged.)
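To make this concrete, here is a minimal Python sketch of how a positive report from such a sensor can move a Boolean hypothesis like Kafur.threat. It assumes (our simplification for illustration, not the model's actual CPTs) that a non-Working sensor emits uninformative noise, and all numbers are invented:

    # Illustrative only: a Boolean sensor update in the spirit of the
    # SIGINT credibility model (TPrate, FPrate, Reliability). We assume
    # for this sketch that a non-Working sensor reports pure noise.
    def posterior_threat(prior, tp_rate, fp_rate, reliability):
        """P(threat | positive report), marginalizing over Working."""
        p_pos_if_threat = reliability * tp_rate + (1 - reliability) * 0.5
        p_pos_if_clear = reliability * fp_rate + (1 - reliability) * 0.5
        evidence = prior * p_pos_if_threat + (1 - prior) * p_pos_if_clear
        return prior * p_pos_if_threat / evidence

    # A reliable sensor moves even a modest prior well toward the red:
    print(posterior_threat(prior=0.2, tp_rate=0.9, fp_rate=0.05,
                           reliability=0.95))   # ~0.75

The same arithmetic runs in reverse when the sensor's error rates are themselves uncertain: each confirmed outcome then becomes evidence about TPrate and FPrate, which is how Reports feed the credibility model.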
The credibility models for HUMINT sources (like Joe) track objectivity, competence, accuracy, and veracity, following [Schum, 1997].[3] Joe's initial credibility model is shown in Figures 4 and 5.

[3] We apply them concurrently; Schum uses a chain.

Figure 4: Wigmore diagram of Joe's initial credibility model. (Some text rearranged.)

Figure 5: The BN fragment for Joe's initial credibility.

At this point, Carl reports two things. First, that he has independent reason to think Joe is lying about the fertilizer. Depending on whether we take Carl to mean that Joe is lying just on this occasion or habitually, we can apply this to Joe's report (via the tellingTruth node), or directly to Joe's credibility model (via the veracity node). Here we take the second (more serious) case – Carl isn't just saying the coin came up tails. He's saying it's weighted to favor tails. Because Carl has very high credibility, his report casts serious doubt on Joe's report. Consequently, Kafur's threat level increases.

Carl also says that Joe has been attending several subversive meetings. We might take this as evidence that Joe is a DoubleAgent, but instead we apply it to his objectivity, which affects his credibility. Again, we drag Carl's Report directly onto Joe. By now, Joe's credibility is down to about 50%. The Wigmore diagram (Figure 6) shows all the evidence influencing Joe's credibility. Kafur's threat level is still in the red zone, though down slightly.

Figure 6: Wigmore view of Joe's final credibility model.

Figure 7: BN fragment for Joe's final credibility model.

Compare that Wigmore diagram (Figure 6) to the underlying Bayes net fragment, shown in Figure 7. As we connect reports, the BN becomes complicated. And recall that this is just the fragment dedicated to Joe's credibility, which is not our main concern. The entire BN has several hundred nodes, as shown in Figure 8. As we discuss in Section 2, the model needs all that machinery to "do the right thing", but the Wigmore diagrams provide a much more accessible view of the underlying model. The analyst is seldom concerned with, nor prepared to encounter, the full machinery.

Figure 8: The whole-model view: each box is a frame, with one or more BN nodes inside. This frame view and the BN views come from our network visualizer.

2 MODEL STRUCTURE

Fundamentally, we have a system for reasoning from observations to hypotheses, accounting for the credibility of sources and the relevance of observations to different hypotheses. It extends and generalizes [Wright and Laskey, 2006] by making report credibilities derive from source credibilities, and by defining general multi-entity Bayesian network (MEBN) fragments[4] that serve for both the scenario and the credibility model. There are three main classes: Hypotheses, Sources, and Reports.

[4] For MEBNs and fragments, see [Laskey, 2006].

Each class includes several Bayesian network nodes, and references other classes. For example, Reports necessarily have a Source and a target Hypothesis. (These typed links give a MEBN system the expressive power of first-order logic, and thereby support dynamic construction of the network.)

One key to our model's success is that nearly every attribute is a Hypothesis, and therefore nearly every claim can become the center of evidence and reasoning – the top node in a Wigmore diagram. We were especially interested in being able to reason about the credibility of Reports and Sources. Therefore, key attributes such as Report.opportunity and Source.accuracy are themselves Hypotheses, for which we can define further Indicators, and to which we can apply evidence (i.e., Reports).

A quick word about our modeling language. The models are defined in Quiddity*Script (Q*S), IET's own modeling language for MEBNs. Although Q*S has traditional classes, it uses a frame system for MEBN fragments: fragments are "frames" and BN nodes are one kind of "slot" that can inhabit a frame. So, for example[5]:

    frame Hypothesis
      slot status
        domain = Object
        distribution = UniformDiscreteDistribution

[5] The examples clean up the syntax for presentation.

By this definition, every Hypothesis (and descendant) has a status node. Because status has a distribution, it will become a BN node. For example, Kafur.threat is a default Boolean Hypothesis. Subclasses may redefine status. For example, ContinuousHypothesis sets domain = Continuous and sets distribution to a function defined in a fn slot.
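As a rough gloss of the frame-and-slot idea, consider the following Python sketch. It is only an analogy (the real system is Q*S running on a MEBN engine): a frame becomes a class, and any slot carrying a distribution becomes a BN node when the frame is instantiated.

    # Our Python gloss of the Q*S frame/slot idea, not IET's code:
    # a frame is a class; a distribution-bearing slot becomes a node.
    class Hypothesis:
        domain = (False, True)              # default Boolean domain

        def __init__(self, name):
            self.name = name
            # 'status' becomes a node with a uniform prior over the domain.
            n = len(self.domain)
            self.status = {value: 1.0 / n for value in self.domain}

    class ContinuousHypothesis(Hypothesis):
        # Subclasses may redefine the domain; here a coarse discretization
        # of [0..1], much as Quiddity discretizes for exact inference.
        domain = tuple(i / 10 for i in range(11))

    threat = Hypothesis("Kafur.threat")     # {False: 0.5, True: 0.5}
    credibility = ContinuousHypothesis("Joe.credibility")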
Figure 9: Hypothesis hierarchy (Hypothesis, with subclasses ContinuousHypothesis and Indicator, and ContinuousIndicator beneath them).

2.1 HYPOTHESES

In our scenario, the main hypothesis was whether the village of Kafur posed a threat. In our system, this Hypothesis was one of the predefined attributes of all entities, Entity.threat. Insofar as possible, every observable is a Hypothesis, or a subclass thereof. Therefore, everything can be observed and reasoned about, including key attributes of our credibility model.

2.1.1 Relevances[6]

[6] This section supplies technical detail that may be skipped. It shows how we have made Hypotheses as general as they are.

Now, often we cannot apply evidence directly to our main Hypothesis. Instead, we define Indicators. For example, Kafur.Allegiance is an Indicator for Kafur.threat. We might define others, such as poverty level, economic instability, youth bulge, presence of militias or weapons, etc. Similarly, we might argue for the credibility of a sensor by reference to sensitivity, specificity, and reliability. An Indicator need not have the same domain as the Hypothesis it indicates, though it may. Later we will see how we use Indicators in our agent credibility model.

Some indicators are better than others, so we need a measure of strength. We call this the Relevance. Then, Indicator I is a function of Hypothesis H and Relevance R: I = f(H, R). For example, if I is normally distributed around H, with a standard deviation given by R, we would say I = N(H, R). Graphically, H → I ← R. This could even be I = H + R, where R = N(0, x). But unless we need to vary the variance (etc.) on the fly, we do not actually need to create separate nodes for R. (So far, our continuous indicators have not needed to do so.)

However, in the general discrete case, the indicator may well have a different CPT for each hypothesis state. We need to be able to use the proper one depending on our current beliefs about the hypothesis. We also wish to define a single class regardless of how many states our Indicator and Hypothesis have. We can do so using reference uncertainty. Our Relevance R is actually a set of nodes – one per hypothesis state – each defining the indicator's CPT for that hypothesis state.

    frame Indicator isa Hypothesis
      slot hypothesis
        domain = Hypothesis
      slot relevance
        domain = Relevance
        parents = [hypothesis.status,
                   relevance.hypothesisValue]   # h, r
        distribution = function h, r
          { if h == r then 1 else 0 end }
      slot status
        # domain is inferred
        parents = [relevance.status]
        distribution = function r { r }

During inference, the correct r ∈ R is chosen by the distribution:

    function h, r { if h == r then 1 else 0 }

In effect, we change the CPT of status on the fly, according to our beliefs about the current hypothesis.

Having defined a structure that can handle such a general case, we next seek to avoid having to fill in all those tables. So we have many special-case constructors which make stronger independence assumptions, and fill in the various R tables for us. In many cases, we can define our relevance with a single number – a notional strength for the Indicator, even for discrete CPTs. For example, when H and I have the same domain, we can use a single number to represent Pr(I = x | H = x), assuming the remainder is spread uniformly among the other states.
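For the same-domain case, that single-number constructor is easy to sketch. The following Python is our reconstruction of the idea (the real constructors are Q*S, and they fill in Relevance nodes rather than raw dictionaries):

    # Build a full CPT from one notional strength s, with
    # Pr(I = x | H = x) = s and the remainder spread uniformly.
    def relevance_cpt(domain, strength):
        n = len(domain)
        off = (1.0 - strength) / (n - 1)    # mass for each mismatched state
        return {h: {i: (strength if i == h else off) for i in domain}
                for h in domain}

    # A three-state allegiance Indicator with notional strength 0.8:
    cpt = relevance_cpt(["Red", "Neutral", "Blue"], strength=0.8)
    # cpt["Blue"] == {"Red": 0.1, "Neutral": 0.1, "Blue": 0.8}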
2.2 SOURCES

We spend more time on Sources when discussing credibility models in Section 3. For now, it is sufficient to know that all sources have a credibility, which is a ContinuousHypothesis ranging over [0..1], with 1 being perfectly credible. Specific kinds of sources have specific attributes that determine the overall credibility. These attributes are themselves Hypotheses, usually ContinuousHypotheses. All agents have accuracy, objectivity, and competence. In addition, since people can lie, sources of type Person have veracity, which is a function of whether they are a DoubleAgent.

2.3 REPORTS

Figure 10: Report schema, showing a Report's links to its Source and its target Event (a Hypothesis), plus its credible and opportunity nodes.

Every Report has opportunity and credible Boolean attributes. A generic Report does not know what kind of source it has, so its credible is a direct function of the source's credibility, as well as opportunity:

    slot credible
      parents = [source.credibility.status,
                 opportunity.status]
      domain = BooleanDomain
      distribution = function cred, opp {
        if opp == false then return [1, 0];
        else {
          c = cred->getMid();
          return [1-c, c];   # [false, true]
        };
      end }

In contrast, a HUMINT report (see Figure 11) knows that the source is a Person, and defines the additional properties accurate, objective, and competent. Then credible is a function of all of these. (For now we simply AND them together.)

    slot credible
      parents = [opportunity.status,
                 competent.status, objective.status,
                 accurate.status]
      domain = BooleanDomain
      distribution = function opp, comp, obj, acc
        { return opp && comp && obj && acc; }
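Read side by side, the two distributions differ only in what they condition on. A small Python paraphrase of both (ours; getMid() in the listing above returns a point estimate of the continuous credibility, which we simply pass in here):

    # Generic Report: credible depends only on opportunity and a point
    # estimate of the source's overall credibility.
    def generic_credible(credibility_mid, opportunity):
        if not opportunity:
            return [1.0, 0.0]               # [P(false), P(true)]
        return [1.0 - credibility_mid, credibility_mid]

    # HUMINT Report: for now, a plain conjunction of the Booleans.
    def humint_credible(opportunity, competent, objective, accurate):
        return opportunity and competent and objective and accurate

    print(generic_credible(0.7, True))               # [0.3, 0.7]
    print(humint_credible(True, True, False, True))  # False: not objective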
3 CREDIBILITY MODELS

The underlying credibility model tells us how much to believe the claim when a source makes a report. With electronic sensors reporting on known events in known conditions, we usually have some information on accuracy, false positive rate, and, of course, reliability. For human observers, David Schum [Schum, 1997, Schum, 1994] has created a detailed credibility model, which is summarized in Figure 12 by Peter Tillers.[8]

[8] The diagram emerged from a discussion among Peter Tillers, Tim van Gelder, and Dan Prager, on the Rationale "Google Group".

Figure 12: Schum's credibility model as an argument map. (Peter Tillers)

So far, we have not considered lying, because our model separates credibility from telling the truth. Credibility refers only to the source's ability to know the situation. Whether they are lying is a separate matter, tracked by HUMINT.tellingTruth, itself determined in part by Source.veracity, which defines their general tendency to tell the truth (to us, anyway).

3.1 SCHUM

In short, to be credible, a human report must be competent, accurate, objective, and truthful. As Figure 12 suggests, Schum has broken each of these down into further observables; our model allows users to add such details, but does not yet require them. Though it is not required, these attributes may perhaps most naturally be thought of as propensities or relative frequencies of accurate, objective, etc. reports. Indeed, in our model, each Report becomes evidence for the credibility of the Source.

The status of a Report reflects both credible and tellingTruth. The more credible the report is, the more a lie will mislead. Conversely, a Report with credible = 0 has no impact on status, regardless of whether the source is lying: they're simply not in a position to know one way or another.

The Report properties credible and status are not Indicators, and so cannot be further observed by Indicators and Reports.[7] Nevertheless, it is possible to provide evidence for constituents like accurate and objective.

[7] The technical reason is that they are functions of more than one parent, and for now, Indicators indicate precisely one Hypothesis.

Figure 11: Fragment showing the key nodes for Joe's report suggesting that the fertilizer is OK.

Figure 11 shows a fragment of the Bayes net with the key nodes for Joe's report claiming the fertilizer shipment is OK (not a threat). TellingTruth, threat, and credible determine the report's status (right), which is known. Opportunity, competent, objective, and accurate determine credible (lower right). Not shown are Carl's reports disparaging Joe's credibility, nor Joe's intrinsic credibility attributes.

3.2 SOURCES

Figure 13: Source hierarchy (Source; Sensor, Reference, Agent; Organization, Person; TerrorOrg).

Figure 13 shows the basic kinds of sources we have defined so far. The main classes are Sensors, References, and Agents. Figure 14 shows the credibility attributes defined for the various kinds of sources.

Figure 14: Credibility models for various sources; CH = ContinuousHypothesis, CI = ContinuousIndicator. In outline:

    Source:                credibility isa CH [0..1]
    Sensor isa Source:     reliability isa CI [0..1]
                           FPrate isa CI [0..1]
                           TPrate isa CI [0..1]
    Reference isa Source:  accuracy isa CI [0..1]
                           objectivity isa CI [0..1]
    Agent isa Source:      accuracy isa CI [0..1]
                           objectivity isa CI [0..1]
                           competence isa CI [0..1]
    Person isa Agent:      veracity isa CI [0..1]

As we saw in Figure 5, the HUMINT credibility model has a Naïve Bayes structure: credibility is treated as an unknown common cause, and the attributes accuracy, objectivity, and competence are assumed to be linked only via the hidden factor, at least until we make reports. The underlying variables are continuous on [0..1],[9] and each attribute is Beta distributed around credibility.

[9] The values are shown discretized. Quiddity can do exact inference by discretization, or approximate inference by particle filters. Our visualizer works only with discrete nodes.

Using a Beta distribution allows a smooth, flexible distribution bounded on [0..1]. It can flex from U-shaped through flat to quite peaked. For example:

    acc = β(0, 1, credibility, 3)

where the range is [0..1], the mean is credibility, and the final parameter (here, 3) is, roughly, the steepness of the peak. Because we never observe credibility itself, the other values are correlated.

Neither do we generally observe the attribute values. Doing so would render them insensitive to data in the form of the accuracy (etc.) of reports coming from that source. Instead, each report creates Boolean Indicators of the attribute in question. This is a binomial model: the parameter (say, accuracy) has a prior distribution, and each Report gives us a Boolean value. We are, in essence, determining the bias of a coin.
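Here is a Python sketch of both steps: a Beta prior centered on the source's credibility, then conjugate coin-bias updating as confirmed report outcomes arrive. We use a standard mean/concentration parameterization as a stand-in for the β(0, 1, mean, steepness) constructor above, and the outcome counts are invented:

    from scipy import stats

    def beta_params(mean, concentration=3.0):
        # Shape parameters for a Beta on [0..1] with the given mean;
        # larger concentration gives a sharper peak around the mean.
        return mean * concentration, (1.0 - mean) * concentration

    # Prior on Joe's accuracy, centered on his current credibility.
    a, b = beta_params(mean=0.7)

    # Each Report eventually yields a Boolean accuracy Indicator once we
    # learn the event's true status: conjugate, coin-bias updating.
    hits, misses = 4, 1                     # illustrative outcomes
    posterior = stats.beta(a + hits, b + misses)
    print(posterior.mean())                 # 0.7625: pulled toward 4/5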
When we create a new source, we can make an initial observation of its attributes. As shown in Figure 5, these initial observations are implemented as ContinuousIndicators, themselves Beta distributed around the attribute value. Therefore, credibility begins its life with three observations. Every Report will generate more observations, especially as we discover the true status of the event in question. Furthermore, we may uncover Indicators that speak directly to these attributes by performing a background check. For example, if Carl reports that Joe is habitually drunk, we might apply that directly to Joe's accuracy, at the very least!

4 CONCLUSION

We have developed a detailed, hierarchical Bayesian credibility model in the style of David Schum's work. Our low-level model allows very general control of continuous and discrete parameters, with many auxiliary nodes defining arbitrary relevance of indicators to hypotheses, using the relational power of multi-entity Bayesian networks. Then we provide constructors that make various kinds of independence assumptions, for example allowing one to use a single measure of "strength", or mapping a continuous variable onto a Boolean indicator without further parameters. Next, we hide the Bayes net behind a Wigmore diagram, stripping the view down to the essential flow of evidence among Reports and Hypotheses, and hiding all the auxiliary machinery. Credibility models are built on the same Hypothesis–Indicator–Report architecture, and can be inspected and augmented in the same way as the scenario hypotheses. Finally, we subordinate the Wigmore diagrams to a map-based GUI. The prototype uses the Google Maps API to support drag-and-drop from reports to entities on the map, or to sources.

The existing system is only a prototype. The GUI does not yet support all of the features of the underlying probabilistic model, and the Wigmore diagram component is still display-only. Neither can the GUI client get to the BN GUI, as we have yet to package the full visualization component into a web service. Similarly, there is no support for retracting or modifying existing observations. At a more fundamental level, the model itself does not yet make use of the dynamic Bayes net capabilities of the underlying engine, but it will have to: a deployed system would have to "roll up" observations older than some horizon. It may allow a scrolling time window, but at no time would it be able to do inference on all the reports and events over all time.

However, a great deal of work has gone into defining the basic Hypothesis–Source–Report architecture and the top-level HUMINT credibility model in a generic, extensible, and composable manner. The ubiquitous use of Hypotheses is only possible because of the object-oriented (or first-order logic) nature of multi-entity Bayesian networks, and it relied on some advanced features of the underlying Quiddity engine to perform domain inference on the fly and to allow us to define likelihood observations over runtime-composed domains.

Acknowledgements

This work was funded by Naval Research Laboratory Contract N00173-06-C-4034.

References

[Laskey, 2006] Laskey, K. B. (2006). MEBN: A logic for open-world probabilistic reasoning. Technical Report C4I06-01, George Mason University C4I Center.

[Schum, 1994] Schum, D. A. (1994). Evidential Foundations of Probabilistic Reasoning. Wiley & Sons, hardback edition. Paperback 2001, Northwestern University Press.

[Schum, 1997] Schum, D. A. (1997). Pedigree: Credibility design. IET Draft Report v1.0, plus revised Appendix A.

[Wigmore, 1931] Wigmore, J. H. (1931). The Principles of Judicial Proof, or the Process of Proof as Given by Logic, Psychology, and General Experience and Illustrated in Judicial Trials. Little, Brown & Co., second edition. There are hardcovers in print of either the 1913, 1931, or 1937 editions.

[Wright and Laskey, 2006] Wright, E. J. and Laskey, K. B. (2006). Credibility models for multi-source fusion. In Proceedings of the 9th International Conference on Information Fusion, Florence, Italy.