Credibility Models

Charles R. Twardy, Edward J. Wright, Stephen J. Canon & Masami Takikawa*
Information Extraction & Transport, Inc.
1911 N. Fort Myer Dr., Suite 600, Arlington, VA 20190
{ctwardy,ewright,scannon,mtakikawa}@iet.com
Revision: 1.6

* Now with Cleverset, Inc. Dr. Takikawa can also be reached at takikawa@cleverset.com.

Abstract

We present a general hierarchical Bayesian model in which Intelligence Sources make Reports about events or states in the world, which we call Hypotheses. The underlying multi-entity Bayes net for even a simple scenario has hundreds of nodes. We hide the details via Wigmore diagrams and a Google Maps GUI. Our application domain is Intelligence data fusion in asymmetrical warfare (terrorism). Some Hypotheses – like whether a village is a threat – may be abstract or unobservable. For these, we define Indicators – more observable Hypotheses whose value has some bearing on the target Hypothesis. The hierarchy can be arbitrarily deep, and Reports can provide evidence at any level. Furthermore, all Sources have credibility models. Traditional Sources are physical sensors with well-known error models. Non-traditional Sources include humans, websites, news, etc. For these Sources, our credibility models include Hypotheses about unknown factors like objectivity, competence, accuracy, reliability, and veracity. Every Report by a Source provides evidence about those factors. So, for example, successful ad hominem attacks against one Source can undermine his assurances that a village is safe, and lead us to believe it is hostile after all.

1 INTRODUCTION

Our domain is structured evidential reasoning, especially Intelligence analysis. Our task is to reason scientifically about the credibility of Intelligence sources, so that we may properly weigh conflicting evidence for and against various hypotheses. We do that via a complex, hierarchical Bayesian network. However, because Intelligence Analysts do not wish to encounter the full computational complexity of the resulting multi-entity model, we provide simplified views via Wigmore diagrams centered on specific hypotheses. Even so, there are too many of these, so we attach them to specific actors in the world. These actors are presented in a map-based GUI like that shown in Figure 1. In our model, there are three main classes: Hypotheses, Sources, and Reports. When capitalized, these refer to classes or objects in our model.[1]

[1] We capitalize Intelligence to distinguish the government activity from the human attribute. Attributes appear like so: accuracy.

Figure 1: Initial screenshot showing the map window, the Sources dock, and incoming evidence. The hypothetical village of Kafur is shown with the tractor logo, in the upper left. The screenshot has been altered to fit.

We begin with a use case involving a fictional farming and fishing village named Kafur, a few Intelligence sources, and reports from those sources. This morning, electronic Signals Intelligence (SIGINT) reported that Kafur received a suspiciously large shipment of fertilizer. We are concerned that it may be used for explosives. Figure 1 shows our initial state. The belief bar under Kafur shows its perceived threat level – a default 50% for an unknown place. Our Intelligence encyclopedia reports that Kafur's allegiance is 80% Blue – which means favorable to us. We drag that Report from Incoming Evidence onto Kafur's icon.
A dialog box lets us confirm the Source, set the precise Hypothesis to Kafur.allegiance, and set the value to "Blue". Encyclopedia has medium credibility, enough to move Kafur's threat level comfortably into the green.

Now we apply this morning's SIGINT Report from "Sensor 1" onto Kafur. For dramatic effect, we consider it direct evidence that Kafur is a threat, so we apply it to Kafur.threat, with the value True. Because Sensor 1 is a reliable source, Kafur's threat probability moves into the red zone.

We begin talking to Sources. Joe reports that the fertilizer is in fact going to farmers who live outside the village. This drops the threat probability into the yellow zone, but Joe is a relatively new source. We're waiting to hear from our agent Carl.

While we wait, we view the Wigmore[2] diagram showing the lines of evidence for Kafur.threat (see Figure 2). Reports from Joe and SIGINT directly influence Kafur.threat, while the Encyclopedia report contributes via Allegiance.

[2] In a Wigmore diagram, arrows show the direction of inference, from evidence to hypothesis [Wigmore, 1931]. BN arrows are often in the opposite direction.

Figure 2: Wigmore diagram showing lines of evidence for Kafur.threat (top), after Joe's HUMINT report. For clarity, we have added short names in large bold, and italicized the unique IDs.

Different sources, in turn, may have different credibility models. For example, the credibility model for SIGINT looks like Figure 3. The sensor attributes TPrate, FPrate, and Reliability (second row) determine probabilities for the Boolean report attributes TP, FP, and Working, respectively (third row). These in turn influence the report's status (the bottom red circle).

Figure 3: Wigmore diagram for SIGINT credibility, after filing one Report. (Some text rearranged.)
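To make this concrete, here is a minimal Python sketch of how a positive report from such a sensor can move a Boolean hypothesis like Kafur.threat. It assumes (our simplification for illustration, not the model's actual CPTs) that a non-Working sensor emits uninformative noise, and all numbers are invented:

    # Illustrative only: a Boolean sensor update in the spirit of the
    # SIGINT credibility model (TPrate, FPrate, Reliability). We assume
    # for this sketch that a non-Working sensor reports pure noise.
    def posterior_threat(prior, tp_rate, fp_rate, reliability):
        """P(threat | positive report), marginalizing over Working."""
        p_pos_if_threat = reliability * tp_rate + (1 - reliability) * 0.5
        p_pos_if_clear = reliability * fp_rate + (1 - reliability) * 0.5
        evidence = prior * p_pos_if_threat + (1 - prior) * p_pos_if_clear
        return prior * p_pos_if_threat / evidence

    # A reliable sensor moves even a modest prior well toward the red:
    print(posterior_threat(prior=0.2, tp_rate=0.9, fp_rate=0.05,
                           reliability=0.95))   # ~0.75

The same arithmetic runs in reverse when the sensor's error rates are themselves uncertain: each confirmed outcome then becomes evidence about TPrate and FPrate, which is how Reports feed the credibility model.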
The credibility models for HUMINT sources (like Joe) track objectivity, competence, accuracy, and veracity, following [Schum, 1997].[3] Joe's initial credibility model is shown in Figures 4 and 5.

[3] We apply them concurrently; Schum uses a chain.

Figure 4: Wigmore diagram of Joe's initial credibility model. (Some text rearranged.)

Figure 5: The BN fragment for Joe's initial credibility.

At this point, Carl reports two things. First, that he has independent reason to think Joe is lying about the fertilizer. Depending on whether we take Carl to mean that Joe is lying just on this occasion or habitually, we can apply this to Joe's report (via the tellingTruth node), or directly to Joe's credibility model (via the veracity node). Here we take the second (more serious) case – Carl isn't just saying the coin came up tails. He's saying it's weighted to favor tails. Because Carl has very high credibility, his report casts serious doubt on Joe's report. Consequently, Kafur's threat level increases.

Carl also says that Joe has been attending several subversive meetings. We might take this as evidence that Joe is a DoubleAgent, but instead we apply it to his objectivity, which affects his credibility. Again, we drag Carl's Report directly onto Joe. By now, Joe's credibility is down to about 50%. The Wigmore diagram (Figure 6) shows all the evidence influencing Joe's credibility. Kafur's threat level is still in the red zone, though down slightly.

Figure 6: Wigmore view of Joe's final credibility model.

Figure 7: BN fragment for Joe's final credibility model.

Compare that Wigmore diagram (Figure 6) to the underlying Bayes net fragment, shown in Figure 7. As we connect reports, the BN becomes complicated. And recall that this is just the fragment dedicated to Joe's credibility, which is not our main concern. The entire BN has several hundred nodes, as shown in Figure 8. As we discuss in Section 2, the model needs all that machinery to "do the right thing", but the Wigmore diagrams provide a much more accessible view of the underlying model. The analyst is seldom concerned with, nor prepared to encounter, the full machinery.

Figure 8: The whole-model view: each box is a frame, with one or more BN nodes inside. This frame view and the BN views come from our network visualizer.

2 MODEL STRUCTURE

Fundamentally, we have a system for reasoning from observations to hypotheses, accounting for the credibility of sources and the relevance of observations to different hypotheses. It extends and generalizes [Wright and Laskey, 2006] by making report credibilities derive from source credibilities, and by defining general multi-entity Bayesian network (MEBN) fragments[4] that serve for both the scenario and the credibility model. There are three main classes: Hypotheses, Sources, and Reports.

[4] For MEBNs and fragments, see [Laskey, 2006].

Each class includes several Bayesian network nodes, and references other classes. For example, Reports necessarily have a Source and a target Hypothesis. (These typed links give a MEBN system the expressive power of first-order logic, and thereby support dynamic construction of the network.)

One key to our model's success is that nearly every attribute is a Hypothesis, and therefore nearly every claim can become the center of evidence and reasoning – the top node in a Wigmore diagram. We were especially interested in being able to reason about the credibility of Reports and Sources. Therefore, key attributes such as Report.opportunity and Source.accuracy are themselves Hypotheses, for which we can define further Indicators, and to which we can apply evidence (i.e., Reports).

A quick word about our modeling language. The models are defined in Quiddity*Script (Q*S), IET's own modeling language for MEBNs. Although Q*S has traditional classes, it uses a frame system for MEBN fragments: fragments are "frames" and BN nodes are one kind of "slot" that can inhabit a frame. So, for example[5]:

    frame Hypothesis
      slot status
        domain = Object
        distribution = UniformDiscreteDistribution

[5] The examples clean up the syntax for presentation.

By this definition, every Hypothesis (and descendant) has a status node. Because status has a distribution, it will become a BN node. For example, Kafur.threat is a default Boolean Hypothesis. Subclasses may redefine status. For example, ContinuousHypothesis sets domain = Continuous and sets distribution to a function defined in a fn slot.
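As a rough gloss of the frame-and-slot idea, consider the following Python sketch. It is only an analogy (the real system is Q*S running on a MEBN engine): a frame becomes a class, and any slot carrying a distribution becomes a BN node when the frame is instantiated.

    # Our Python gloss of the Q*S frame/slot idea, not IET's code:
    # a frame is a class; a distribution-bearing slot becomes a node.
    class Hypothesis:
        domain = (False, True)              # default Boolean domain

        def __init__(self, name):
            self.name = name
            # 'status' becomes a node with a uniform prior over the domain.
            n = len(self.domain)
            self.status = {value: 1.0 / n for value in self.domain}

    class ContinuousHypothesis(Hypothesis):
        # Subclasses may redefine the domain; here a coarse discretization
        # of [0..1], much as Quiddity discretizes for exact inference.
        domain = tuple(i / 10 for i in range(11))

    threat = Hypothesis("Kafur.threat")     # {False: 0.5, True: 0.5}
    credibility = ContinuousHypothesis("Joe.credibility")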
Figure 9: Hypothesis hierarchy (Hypothesis, with subclasses ContinuousHypothesis and Indicator, and ContinuousIndicator beneath them).

2.1 HYPOTHESES

In our scenario, the main hypothesis was whether the village of Kafur posed a threat. In our system, this Hypothesis was one of the predefined attributes of all entities, Entity.threat. Insofar as possible, every observable is a Hypothesis, or a subclass thereof. Therefore, everything can be observed and reasoned about, including key attributes of our credibility model.

2.1.1 Relevances[6]

[6] This section supplies technical detail that may be skipped. It shows how we have made Hypotheses as general as they are.

Now, often we cannot apply evidence directly to our main Hypothesis. Instead, we define Indicators. For example, Kafur.Allegiance is an Indicator for Kafur.threat. We might define others, such as poverty level, economic instability, youth bulge, presence of militias or weapons, etc. Similarly, we might argue for the credibility of a sensor by reference to sensitivity, specificity, and reliability. An Indicator need not have the same domain as the Hypothesis it indicates, though it may. Later we will see how we use Indicators in our agent credibility model.

Some indicators are better than others, so we need a measure of strength. We call this the Relevance. Then, Indicator I is a function of Hypothesis H and Relevance R: I = f(H, R). For example, if I is normally distributed around H, with a standard deviation given by R, we would say I = N(H, R). Graphically, H → I ← R. This could even be I = H + R, where R = N(0, x). But unless we need to vary the variance (etc.) on the fly, we do not actually need to create separate nodes for R. (So far, our continuous indicators have not needed to do so.)

However, in the general discrete case, the indicator may well have a different CPT for each hypothesis state. We need to be able to use the proper one depending on our current beliefs about the hypothesis. We also wish to define a single class regardless of how many states our Indicator and Hypothesis have. We can do so using reference uncertainty. Our Relevance R is actually a set of nodes – one per hypothesis state – each defining the indicator's CPT for that hypothesis state.

    frame Indicator isa Hypothesis
      slot hypothesis
        domain = Hypothesis
      slot relevance
        domain = Relevance
        parents = [hypothesis.status,
                   relevance.hypothesisValue]   # h, r
        distribution = function h, r
          { if h == r then 1 else 0 end }
      slot status
        # domain is inferred
        parents = [relevance.status]
        distribution = function r { r }

During inference, the correct r ∈ R is chosen by the distribution:

    function h, r { if h == r then 1 else 0 }

In effect, we change the CPT of status on the fly, according to our beliefs about the current hypothesis.

Having defined a structure that can handle such a general case, we next seek to avoid having to fill in all those tables. So we have many special-case constructors which make stronger independence assumptions, and fill in the various R tables for us. In many cases, we can define our relevance with a single number – a notional strength for the Indicator, even for discrete CPTs. For example, when H and I have the same domain, we can use a single number to represent Pr(I = x | H = x), assuming the remainder is spread uniformly among the other states.
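For the same-domain case, that single-number constructor is easy to sketch. The following Python is our reconstruction of the idea (the real constructors are Q*S, and they fill in Relevance nodes rather than raw dictionaries):

    # Build a full CPT from one notional strength s, with
    # Pr(I = x | H = x) = s and the remainder spread uniformly.
    def relevance_cpt(domain, strength):
        n = len(domain)
        off = (1.0 - strength) / (n - 1)    # mass for each mismatched state
        return {h: {i: (strength if i == h else off) for i in domain}
                for h in domain}

    # A three-state allegiance Indicator with notional strength 0.8:
    cpt = relevance_cpt(["Red", "Neutral", "Blue"], strength=0.8)
    # cpt["Blue"] == {"Red": 0.1, "Neutral": 0.1, "Blue": 0.8}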
2.2 SOURCES

We spend more time on Sources when discussing credibility models in Section 3. For now, it is sufficient to know that all sources have a credibility, which is a ContinuousHypothesis ranging over [0..1], with 1 being perfectly credible. Specific kinds of sources have specific attributes that determine the overall credibility. These attributes are themselves Hypotheses, usually ContinuousHypotheses. All agents have accuracy, objectivity, and competence. In addition, since people can lie, sources of type Person have veracity, which is a function of whether they are a DoubleAgent.

2.3 REPORTS

Figure 10: Report schema, showing a Report's links to its Source and its target Event (a Hypothesis), plus its credible and opportunity nodes.

Every Report has opportunity and credible Boolean attributes. A generic Report does not know what kind of source it has, so its credible is a direct function of the source's credibility, as well as opportunity:

    slot credible
      parents = [source.credibility.status,
                 opportunity.status]
      domain = BooleanDomain
      distribution = function cred, opp {
        if opp == false then return [1, 0];
        else {
          c = cred->getMid();
          return [1-c, c];   # [false, true]
        };
      end }

In contrast, a HUMINT report (see Figure 11) knows that the source is a Person, and defines the additional properties accurate, objective, and competent. Then credible is a function of all of these. (For now we simply AND them together.)

    slot credible
      parents = [opportunity.status,
                 competent.status, objective.status,
                 accurate.status]
      domain = BooleanDomain
      distribution = function opp, comp, obj, acc
        { return opp && comp && obj && acc; }
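Read side by side, the two distributions differ only in what they condition on. A small Python paraphrase of both (ours; getMid() in the listing above returns a point estimate of the continuous credibility, which we simply pass in here):

    # Generic Report: credible depends only on opportunity and a point
    # estimate of the source's overall credibility.
    def generic_credible(credibility_mid, opportunity):
        if not opportunity:
            return [1.0, 0.0]               # [P(false), P(true)]
        return [1.0 - credibility_mid, credibility_mid]

    # HUMINT Report: for now, a plain conjunction of the Booleans.
    def humint_credible(opportunity, competent, objective, accurate):
        return opportunity and competent and objective and accurate

    print(generic_credible(0.7, True))               # [0.3, 0.7]
    print(humint_credible(True, True, False, True))  # False: not objective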
3 CREDIBILITY MODELS

The underlying credibility model tells us how much to believe the claim when a source makes a report. With electronic sensors reporting on known events in known conditions, we usually have some information on accuracy, false positive rate, and, of course, reliability. For human observers, David Schum [Schum, 1997, Schum, 1994] has created a detailed credibility model, which is summarized in Figure 12 by Peter Tillers.[8]

[8] The diagram emerged from a discussion among Peter Tillers, Tim van Gelder, and Dan Prager, on the Rationale "Google Group".

Figure 12: Schum's credibility model as an argument map. (Peter Tillers)

So far, we have not considered lying, because our model separates credibility from telling the truth. Credibility refers only to the source's ability to know the situation. Whether they are lying is a separate matter, tracked by HUMINT.tellingTruth, itself determined in part by Source.veracity, which defines their general tendency to tell the truth (to us, anyway).

3.1 SCHUM

In short, to be credible, a human report must be competent, accurate, objective, and truthful. As Figure 12 suggests, Schum has broken each of these down into further observables; our model allows users to add such details, but does not yet require them. Though it is not required, these attributes may perhaps most naturally be thought of as propensities or relative frequencies of accurate, objective, etc. reports. Indeed, in our model, each Report becomes evidence for the credibility of the Source.

The status of a Report reflects both credible and tellingTruth. The more credible the report is, the more a lie will mislead. Conversely, a Report with credible = 0 has no impact on status, regardless of whether the source is lying: they're simply not in a position to know one way or another.

The Report properties credible and status are not Indicators, and so cannot be further observed by Indicators and Reports.[7] Nevertheless, it is possible to provide evidence for constituents like accurate and objective.

[7] The technical reason is that they are functions of more than one parent, and for now, Indicators indicate precisely one Hypothesis.

Figure 11: Fragment showing the key nodes for Joe's report suggesting that the fertilizer is OK.

Figure 11 shows a fragment of the Bayes net with the key nodes for Joe's report claiming the fertilizer shipment is OK (not a threat). TellingTruth, threat, and credible determine the report's status (right), which is known. Opportunity, competent, objective, and accurate determine credible (lower right). Not shown are Carl's reports disparaging Joe's credibility, nor Joe's intrinsic credibility attributes.

3.2 SOURCES

Figure 13: Source hierarchy (Source; Sensor, Reference, Agent; Organization, Person; TerrorOrg).

Figure 13 shows the basic kinds of sources we have defined so far. The main classes are Sensors, References, and Agents. Figure 14 shows the credibility attributes defined for the various kinds of sources.

Figure 14: Credibility models for various sources; CH = ContinuousHypothesis, CI = ContinuousIndicator. In outline:

    Source:                credibility isa CH [0..1]
    Sensor isa Source:     reliability isa CI [0..1]
                           FPrate isa CI [0..1]
                           TPrate isa CI [0..1]
    Reference isa Source:  accuracy isa CI [0..1]
                           objectivity isa CI [0..1]
    Agent isa Source:      accuracy isa CI [0..1]
                           objectivity isa CI [0..1]
                           competence isa CI [0..1]
    Person isa Agent:      veracity isa CI [0..1]

As we saw in Figure 5, the HUMINT credibility model has a Naïve Bayes structure: credibility is treated as an unknown common cause, and the attributes accuracy, objectivity, and competence are assumed to be linked only via the hidden factor, at least until we make reports. The underlying variables are continuous on [0..1],[9] and each attribute is Beta distributed around credibility.

[9] The values are shown discretized. Quiddity can do exact inference by discretization, or approximate inference by particle filters. Our visualizer works only with discrete nodes.

Using a Beta distribution allows a smooth, flexible distribution bounded on [0..1]. It can flex from U-shaped through flat to quite peaked. For example:

    acc = β(0, 1, credibility, 3)

where the range is [0..1], the mean is credibility, and the final parameter (here, 3) is, roughly, the steepness of the peak. Because we never observe credibility itself, the other values are correlated.

Neither do we generally observe the attribute values. Doing so would render them insensitive to data in the form of the accuracy (etc.) of reports coming from that source. Instead, each report creates Boolean Indicators of the attribute in question. This is a binomial model: the parameter (say, accuracy) has a prior distribution, and each Report gives us a Boolean value. We are, in essence, determining the bias of a coin.
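Here is a Python sketch of both steps: a Beta prior centered on the source's credibility, then conjugate coin-bias updating as confirmed report outcomes arrive. We use a standard mean/concentration parameterization as a stand-in for the β(0, 1, mean, steepness) constructor above, and the outcome counts are invented:

    from scipy import stats

    def beta_params(mean, concentration=3.0):
        # Shape parameters for a Beta on [0..1] with the given mean;
        # larger concentration gives a sharper peak around the mean.
        return mean * concentration, (1.0 - mean) * concentration

    # Prior on Joe's accuracy, centered on his current credibility.
    a, b = beta_params(mean=0.7)

    # Each Report eventually yields a Boolean accuracy Indicator once we
    # learn the event's true status: conjugate, coin-bias updating.
    hits, misses = 4, 1                     # illustrative outcomes
    posterior = stats.beta(a + hits, b + misses)
    print(posterior.mean())                 # 0.7625: pulled toward 4/5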
When we create a new source, we can make an initial observation of its attributes. As shown in Figure 5, these initial observations are implemented as ContinuousIndicators, themselves Beta distributed around the attribute value. Therefore, credibility begins its life with three observations. Every Report will generate more observations, especially as we discover the true status of the event in question. Furthermore, we may uncover Indicators that speak directly to these attributes by performing a background check. For example, if Carl reports that Joe is habitually drunk, we might apply that directly to Joe's accuracy, at the very least!

4 CONCLUSION

We have developed a detailed, hierarchical Bayesian credibility model in the style of David Schum's work. Our low-level model allows very general control of continuous and discrete parameters, with many auxiliary nodes defining arbitrary relevance of indicators to hypotheses, using the relational power of multi-entity Bayesian networks. Then we provide constructors that make various kinds of independence assumptions, for example allowing one to use a single measure of "strength", or mapping a continuous variable onto a Boolean indicator without further parameters. Next, we hide the Bayes net behind a Wigmore diagram, stripping the view down to the essential flow of evidence among Reports and Hypotheses, and hiding all the auxiliary machinery. Credibility models are built on the same Hypothesis–Indicator–Report architecture, and can be inspected and augmented in the same way as the scenario hypotheses. Finally, we subordinate the Wigmore diagrams to a map-based GUI. The prototype uses the Google Maps API to support drag-and-drop from reports to entities on the map, or to sources.

The existing system is only a prototype. The GUI does not yet support all of the features of the underlying probabilistic model, and the Wigmore diagram component is still display-only. Neither can the GUI client get to the BN GUI, as we have yet to package the full visualization component into a web service. Similarly, there is no support for retracting or modifying existing observations. At a more fundamental level, the model itself does not yet make use of the dynamic Bayes net capabilities of the underlying engine, but it will have to: a deployed system would have to "roll up" observations older than some horizon. It may allow a scrolling time window, but at no time would it be able to do inference on all the reports and events over all time.

However, a great deal of work has gone into defining the basic Hypothesis–Source–Report architecture and the top-level HUMINT credibility model in a generic, extensible, and composable manner. The ubiquitous use of Hypotheses is only possible because of the object-oriented (or first-order logic) nature of multi-entity Bayesian networks, and it relied on some advanced features of the underlying Quiddity engine to perform domain inference on the fly and to allow us to define likelihood observations over runtime-composed domains.

Acknowledgements

This work was funded by Naval Research Laboratory Contract N00173-06-C-4034.

References

[Laskey, 2006] Laskey, K. B. (2006). MEBN: A logic for open-world probabilistic reasoning. Technical Report C4I06-01, George Mason University C4I Center.

[Schum, 1994] Schum, D. A. (1994). Evidential Foundations of Probabilistic Reasoning. Wiley & Sons, hardback edition. Paperback 2001, Northwestern University Press.

[Schum, 1997] Schum, D. A. (1997). Pedigree: Credibility design. IET Draft Report v1.0, plus revised Appendix A.

[Wigmore, 1931] Wigmore, J. H. (1931). The Principles of Judicial Proof, or the Process of Proof as Given by Logic, Psychology, and General Experience and Illustrated in Judicial Trials. Little, Brown & Co., second edition. There are hardcovers in print of either the 1913, 1931, or 1937 editions.

[Wright and Laskey, 2006] Wright, E. J. and Laskey, K. B. (2006). Credibility models for multi-source fusion. In Proceedings of the 9th International Conference on Information Fusion, Florence, Italy.