<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Probabilistic Argument Maps for Intelligence Analysis: Completed Capabilities</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Haystax Technology</institution>
          ,
          <addr-line>McLean, VA and Las Vegas, NV</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Innovative Analytics and Training</institution>
          ,
          <addr-line>Washington, D.C.</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Robert Schrag</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>34</fpage>
      <lpage>39</lpage>
      <abstract>
        <p>Intelligence analysts are tasked to produce well-reasoned, transparent arguments with justified likelihood assessments for plausible outcomes regarding past, present, or future situations. Traditional argument maps help to structure reasoning but afford no computational support for probabilistic judgments. We automatically generate Bayesian networks from argument map specifications to compute probabilities for every argument map node. Resulting analytical products are operational, in that (e.g.) analysts or their decision-making customers can interactively explore different combinations of analytical assumptions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        In intelligence analysis, argument mapping [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presents a
problem-solving framework built around key elements of
the intelligence issue being addressed, makes analytic
reasoning shortfalls and information gaps more visible,
prompts consideration of both supporting and refuting
evidence (mitigating confirmation bias [5]), allows for
comparison of multiple hypotheses, and translates easily
into standard written formats with the bottom line up front and
supporting reasoning organized logically.
      </p>
      <p>
        Haystax has developed a probabilistic argument mapping
framework called FUSION<sup>1</sup>. Faced with the challenge of
operationalizing subject matter experts’ (SMEs’)
policy-guided reasoning about person trustworthiness in a
comprehensive risk model [
        <xref ref-type="bibr" rid="ref9">10</xref>
        ], we first developed
CARBON, now one of many models supported by the FUSION
framework. The CARBON domain’s high volume (hundreds)
of policy statements and need for SMEs both to understand
the model and to author its elements inspired us to develop
and apply a technical approach that enhances argument
maps with SME-accessible probabilistic reasoning.
      </p>
      <p>1 SMALL-CAPS typeface distinguishes tools and frameworks.</p>
    </sec>
    <sec id="sec-2">
      <p>
        We developed the FUSION framework having recognized
the general need for and latent power of a probabilistic
argument mapping approach across many application
areas, including our own software product and service line.
In the last three years of building FUSION, we have
identified and resolved subtle representation and reasoning
issues in a coherent, integrated computational framework
with APIs and UIs at multiple levels, including a top-level
GUI. We recently began addressing the specific
requirements of argumentation for intelligence analysis,
appealing initially as a driving use case to the CIA’s Iraq
retaliation scenario [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], where Iraq might respond to US
forces’ bombing of its intelligence headquarters by
conducting major, minor, or no terror attacks, given limited
evidence about Saddam Hussein’s disposition and public
statements, Iraq’s historical responses, and the status of
Iraq’s national security apparatus.
      </p>
      <p>
        Intelligence analysts traditionally develop their judgments
about the likelihood of a given situation’s outcome using ad
hoc methods that consider probabilistic notions but do not
necessarily implement mathematically sound probabilistic
reasoning. Bayesian network inference propagates beliefs in
all directions (not just up from leaf nodes towards root
hypotheses, but also back down<sup>2</sup>) in a process that is
generally too complex for any human to follow completely
beyond small pedagogical examples. For a very large class
of intelligence analysis problems [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], this belief propagation
is very fast—much faster than needed to support graphical
user interface (GUI) interaction. Once propagation has
settled, observed probabilities are all consistent. Clicking
around an argument map FUSION model in the GUI, analysts
can observe which of their input likelihood assessments
have what effects on computed beliefs, for all nodes.
      </p>
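      <p>As a minimal sketch of what propagating beliefs “in all directions” means, the following Python example computes beliefs by brute-force enumeration over a three-node network (one hypothesis H with two evidence children E1 and E2). The network shape, the variable names, and every probability are illustrative assumptions, not values from any FUSION model, and enumeration merely stands in for whatever propagation algorithm FUSION actually uses.</p>
      <preformat><![CDATA[
```python
from itertools import product

# Illustrative numbers only; they are NOT taken from the paper's models.
PRIOR_H = 0.2                              # P(H = true)
P_E1_GIVEN_H = {True: 0.7, False: 0.1}     # P(E1 = true | H)
P_E2_GIVEN_H = {True: 0.6, False: 0.2}     # P(E2 = true | H)


def joint(h, e1, e2):
    """Joint probability of one full assignment for the network H -> E1, H -> E2."""
    p = PRIOR_H if h else 1.0 - PRIOR_H
    p *= P_E1_GIVEN_H[h] if e1 else 1.0 - P_E1_GIVEN_H[h]
    p *= P_E2_GIVEN_H[h] if e2 else 1.0 - P_E2_GIVEN_H[h]
    return p


def belief(query, findings=None):
    """P(query = true | findings), by enumerating all eight possible worlds."""
    findings = findings or {}
    numerator = denominator = 0.0
    for h, e1, e2 in product([True, False], repeat=3):
        world = {"H": h, "E1": e1, "E2": e2}
        # Skip worlds that contradict a hard finding.
        if any(world[name] != value for name, value in findings.items()):
            continue
        p = joint(h, e1, e2)
        denominator += p
        if world[query]:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print("P(H)            =", round(belief("H"), 4))
    print("P(H  | E1=true) =", round(belief("H", {"E1": True}), 4))
    print("P(E2)           =", round(belief("E2"), 4))
    print("P(E2 | E1=true) =", round(belief("E2", {"E1": True}), 4))
```
]]></preformat>
      <p>Under these assumed numbers, a hard finding on E1 changes the belief in H (up towards the hypothesis) and also the belief in E2 (back down to another leaf), which is the multi-directional behavior described above.</p>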
      <p>
        FUSION models<sup>3</sup> are intuitively simple yet technically
sophisticated. We have developed software [
        <xref ref-type="bibr" rid="ref12">13</xref>
        ] to convert
probabilistic argument maps into corresponding Bayesian
networks (BNs). The conversion software recognizes a
pattern of types of argument map links that are incident on a
given statement and constructs a conditional probability
table (CPT) for the corresponding BN node (a random
variable representing the statement’s truth or falsity) to
implement appropriate reasoning. The SME (here, the
analyst) thus works with argument maps (as if on a
dashboard), and BN mechanics and minutiae<sup>4</sup> all remain
conveniently “under the hood.”
      </p>
      <p>2 Note the finding set in Figure 2, e.g.</p>
    </sec>
    <sec id="sec-3">
      <p>3 A FUSION model is a probabilistic argument map (a
computational model of an analytical argument). We use the terms
“argument” and “model” interchangeably.</p>
      <sec id="sec-3-1">
        <title>2 Analyst’s structured FUSION model</title>
        <p>
          Given the Iraq retaliation scenario description from [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ],
our intelligence analyst followed a structured argumentation
process drawing loosely on analysis of competing
hypotheses (ACH)<sup>5</sup> to draft a purely textual argument. A
fragment appears as Table 1. Note the analyst’s
grouping of evidence statements into five categories (past
reactions, capability, initial responses, and political and
psychological motivations), which we take to be exemplary
of predictive intelligence questions as an analytical problem
class. Under the structured process, an analyst first asserts
hypotheses, then evidence statements (formulated as
hypothesis-neutral), then rates each evidence statement for
consistency with and relevance to each hypothesis.
        </p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Fragment of the analyst’s textual argument (Hypothesis 2).</p>
          </caption>
          <table>
            <thead>
              <tr><th>Level</th><th>Statement</th><th>Polarity</th><th>Assessment</th></tr>
            </thead>
            <tbody>
              <tr><td>Hypothesis 2</td><td>Iraq will sponsor some minor terrorist actions in the Middle East</td><td>Refuting</td><td>High Uncertainty</td></tr>
              <tr><td>Category</td><td>Past reaction to similar events</td><td>Refuting</td><td>High Uncertainty</td></tr>
              <tr><td>Evidence</td><td>Absence of terrorist offensive during the 1991 Gulf War</td><td>Refuting</td><td>Credibility High, Relevance Low</td></tr>
              <tr><td>Evidence</td><td>Iraq responded with low scale response to “provocations” by Iran</td><td>Supporting</td><td>Credibility High, Relevance Low</td></tr>
              <tr><td>Category</td><td>Capability to respond – military and intelligence capabilities</td><td>Supporting</td><td>High Uncertainty</td></tr>
              <tr><td>Evidence</td><td>Small network of agents which could be used to attack US interests in the Middle East and Europe</td><td>Supporting</td><td>Credibility Medium, Relevance Low</td></tr>
              <tr><td>Evidence</td><td>Network has only been used to go after Iraqi dissidents</td><td>Refuting</td><td>Credibility Medium, Relevance Low</td></tr>
              <tr><td>Category</td><td>Initial responses to the bombing</td><td>Supporting</td><td>High Uncertainty</td></tr>
              <tr><td>Evidence</td><td>Saddam public statement of intent not to retaliate</td><td>Refuting</td><td>Credibility Low, Relevance Low</td></tr>
              <tr><td>Evidence</td><td>Increase in frequency/length of monitored Iraqi agent radio broadcasts</td><td>Supporting</td><td>Credibility Medium, Relevance Low</td></tr>
              <tr><td>Evidence</td><td>Iraqi embassies instructed to take increased security precautions</td><td>Supporting</td><td>Credibility Medium, Relevance Low</td></tr>
              <tr><td>Category</td><td>Political motivations driving response decision</td><td>Refuting</td><td>High Uncertainty</td></tr>
              <tr><td>Assumption</td><td>Saddam would not want to provoke another US attack</td><td>Refuting</td><td>Validity Medium, Relevance Medium</td></tr>
              <tr><td>Category</td><td>Implication of Saddam’s psychological makeup for a decision on responding</td><td>Supporting</td><td>High Uncertainty</td></tr>
              <tr><td>Assumption</td><td>Failure to retaliate would be a loss of face for Saddam</td><td>Supporting</td><td>Validity Low, Relevance Low</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>In Table 1, likelihood reasoning is captured as follows.</p>
        <list list-type="bullet">
          <list-item><p>Relevance captures the degree to which a posited statement supports or refutes the (sub-)hypothesis statement to which it is connected. Our analyst includes explicit headings for the evidence categories.</p></list-item>
          <list-item><p>Credibility captures the degree to which an evidence statement is considered believable, based on attributes of an associated source report.</p></list-item>
          <list-item><p>Validity captures the analyst’s assessment of the legitimacy of a posited assumption statement, that is, one for which sourced evidence is unavailable or unexpected.</p></list-item>
          <list-item><p>Uncertainty captures the analyst’s (presumably, ad hoc) roll-up accounting for the other three likelihood notions above, respecting argument structure.</p></list-item>
        </list>
        <p>
          4 The conversion software creates auxiliary BN nodes for some
link type patterns (e.g., MitigatedBy in [
          <xref ref-type="bibr" rid="ref12">13</xref>
          ]).
        </p>
        <p>
          5 ACH (see [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], chapter 8) is intended to induce a workflow
enhancing the elicitation of hypotheses and evidence and to reduce
biases towards any particular lines of reasoning. It elicits informal
likelihoods but falls short of eliciting the conditional probabilities
that are essential to true Bayesian reasoning. While some ACH
tools do implement ad hoc likelihood combination methods, ACH
itself has no integral probabilistic framework.
        </p>
        <p>Figure 1 is a screenshot<sup>6</sup> of our encoding of the analyst’s
argument as a FUSION model, which includes outcome
hypothesis nodes (circled yellow), evidence category nodes
(circled green), and evidence nodes (right of category
nodes), plus additional nodes for the sake of logic
(IraqRetaliatesWithTerror) and organization (IraqChoosesTerror).
The former is true if either “TerrorAttacks” statement is
true. The latter collects support from the four category
nodes that in her model are the same for the two terror
hypotheses, using indication strengths per her specification.
For brevity, we’ve hidden all the evidence credibility and
assumption validity nodes. We’ve set appropriate findings
on all evidence, assumption credibility, and validity nodes.
Hypothesis 2 (minor terror) has a computed belief of 17%,
hypothesis 3 (major terror) 2%. By comparison, our analyst
estimated a belief range of 20–45% for hypothesis 2. The
traditional process rolls up likelihoods from evidence to
hypotheses, normalizing to 1.0 across hypotheses. In
contrast, Bayesian belief propagation is multi-directional,
updating beliefs over an entire model. A version of this
model addressing only Hypothesis 2 computes 23% belief—
within the analyst’s bounds.</p>
        <p>6 With the GUI, a user can edit a model to add, delete, or
change nodes or links, navigate to show or hide a displayed node’s
upstream, downstream, or neighbor nodes, find (per text search)
and display a hidden node, select either bottom-to-top or
right-to-left argument stream orientation, and explore different situations
by entering (or clearing) BN “hard findings” that arbitrarily (often
temporarily) state unequivocally that a given statement should be
taken either as true or false. Upon a finding entry, FUSION
performs BN belief propagation and updates the display.</p>
        <p>For each node, the modeler specifies a full-sentence statement
and chooses a short label for display on the node’s GUI icon. The
GUI will display the full statement on mouse-over or drill-down.</p>
        <p>[Figure 1 (screenshot of the analyst’s argument encoded as a FUSION model) appears here; recoverable legend labels: Or summary, And summary, Downstream.]</p>
        <p>
          Note that the FUSION model is more compact than the full
textual specification. The model mentions each statement
only once. Besides uncluttering the modeling canvas, this
convention helps enforce consistency. Consider that given
assumption statements should carry the same truth values in
a fair comparison of different hypotheses. So, we shouldn’t
assess NoRetaliationForUSBombing with
IraqiAgentRadioChatter turned on and
IraqMinorTerrorAttacks with IraqiAgentRadioChatter
turned off. The GUI shows model state under one given set
of assumption values at one time. When a user changes
assumption values, computed beliefs displayed for all
statements (including outcomes) are updated together.
RelevantIf link (not a standard IndicatedBy link), which
serves to discount consideration of the promise when we
believe Saddam to be dishonest in making it.<sup>8</sup>
• Temporal relevance, reflecting decay either in importance
of a past event or in continuing reliability of a past state
observation. In FUSION, an event’s/observation’s
relevance decays per a user-specified half-life [
          <xref ref-type="bibr" rid="ref9">10</xref>
          ].
        </p>
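        <p>The half-life decay of temporal relevance mentioned above can be illustrated with a short Python sketch. It assumes plain exponential decay, which is what a half-life implies; the function name and the way the resulting factor would scale a link’s indication strength are our own illustrative assumptions, not details confirmed by the FUSION documentation.</p>
        <preformat><![CDATA[
```python
def temporal_relevance(age, half_life):
    """Fraction of the original relevance remaining after `age`.

    `age` and `half_life` must use the same time unit (e.g., years).
    Hypothetical helper: exponential decay is assumed here.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if age < 0:
        raise ValueError("age must be non-negative")
    return 0.5 ** (age / half_life)


if __name__ == "__main__":
    # With an assumed half-life of 5 years, relevance halves every 5 years.
    for age in (0, 5, 10):
        print(age, temporal_relevance(age, half_life=5))
```
]]></preformat>
        <p>A shorter half-life makes old events or observations fade faster; a very long one treats them as nearly permanent.</p>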
        <p>3 Scientist’s incentive-oriented FUSION model</p>
        <p>Figure 2 is a screenshot of a model by one of our
scientists emphasizing Saddam’s incentives to act,
considering the issues of maintaining diplomatic status,
maintaining a face of strength with his public, and whether
the US might not expect retaliation (so harden defenses, likely
foiling any attack) if he promises none. By setting a hard
finding of false or true on the incentive-collecting node
SaddamWins, we can examine computed beliefs (plotted in
Figure 3) under Saddam’s worst- and best-case scenarios.
We see that Saddam is much more likely to have engaged in
terror in a situation in which he loses than one in which he
wins, so terror is not in his best interests. Figure 3 also
plots beliefs for the situation in which there is no finding
and the 50% prior probability on SaddamWins prevails.</p>
        <p>8 A FUSION MitigatedBy link works symmetrically, discounting
an influence when the mitigator is true.</p>
        <p>We believe FUSION’s combination of argument maps and
BNs to be unique.</p>
        <p>
          Karvetski et al. [
          <xref ref-type="bibr" rid="ref6">7</xref>
          ] propose BN development facilitated by a BN expert,
following an ACH-based protocol. The BN
adaptation is intended to overcome ACH weaknesses
associated with informal treatment of uncertainty.
Modeling the 1984 Rajneeshee bioterror attack themselves,
the authors envision how practicing analysts might
productively collaborate in an ACH style. Appealing to
standard elicitation techniques, they elicit (from each
supposed analyst) 118 coarse-grained probability
assessments to complete the CPTs for 14 nodes (one
ternary<sup>9</sup>, 13 binary) with a combined total of 19 parents.<sup>10</sup>
Like us, they eschew duplicate nodes (which unnecessarily
complicate probability reasoning). A corresponding FUSION
model would require no more than 27 indication
polarity-and-strength assessments. We have designed FUSION to
eliminate the need for a knowledge representation and
reasoning specialist (a BN expert) to facilitate knowledge
acquisition, so that analysts can build argument models
themselves.
        </p>
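        <p>The difference in elicitation burden discussed above comes from how CPT size grows with the number of parents. The following Python sketch counts the free probabilities in a fully specified CPT and compares that with one assessment per incoming link. The parent configurations shown are hypothetical, so the sketch does not reproduce the 118 or 27 figures, and “one assessment per link” is a simplifying assumption about link-based specification.</p>
        <preformat><![CDATA[
```python
from math import prod


def cpt_free_parameters(node_states, parent_state_counts):
    """Free probabilities in a full CPT.

    One row per joint parent configuration; each row needs
    (node_states - 1) numbers because a row must sum to 1.
    """
    rows = prod(parent_state_counts)  # prod of an empty list is 1 (a root node)
    return rows * (node_states - 1)


def link_assessments(parent_state_counts):
    """Assumed link-based burden: one polarity-and-strength rating per parent link."""
    return len(parent_state_counts)


if __name__ == "__main__":
    # Hypothetical binary node with 1, 2, 3, and 4 binary parents.
    for n_parents in range(1, 5):
        parents = [2] * n_parents
        print(n_parents, cpt_free_parameters(2, parents), link_assessments(parents))
```
]]></preformat>
        <p>The full-CPT count doubles with each added binary parent, while the link-based count grows by one, which is why a link-level specification stays within reach of a non-specialist.</p>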
        <p>9 A FUSION model would factor the ternary node into three
binary ones, over which it would apply an xOr@Logic constraint.</p>
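        <p>The factoring described in footnote 9 can be sketched in Python as follows: three binary statements constrained so that exactly one is true behave like a single three-state variable. The function names and the assumption that the binary statements are independent before the constraint is applied are ours, for illustration only; this is not the xOr@Logic implementation.</p>
        <preformat><![CDATA[
```python
from itertools import product


def exactly_one(values):
    """The exclusive-or constraint over several statements: exactly one is true."""
    return sum(values) == 1


def constrained_beliefs(priors):
    """Beliefs in each binary statement after conditioning on exactly_one.

    `priors` are the (assumed independent) probabilities that each
    statement is true before the constraint is imposed.
    """
    weighted_worlds = []
    for world in product([True, False], repeat=len(priors)):
        if not exactly_one(world):
            continue  # the constraint rules this combination out
        weight = 1.0
        for is_true, p in zip(world, priors):
            weight *= p if is_true else 1.0 - p
        weighted_worlds.append((world, weight))
    total = sum(weight for _, weight in weighted_worlds)
    if total == 0.0:
        raise ValueError("the constraint is impossible under these priors")
    return [
        sum(weight for world, weight in weighted_worlds if world[i]) / total
        for i in range(len(priors))
    ]


if __name__ == "__main__":
    # Three binary statements standing in for the three states of one ternary node.
    print(constrained_beliefs([0.5, 0.5, 0.5]))
```
]]></preformat>
        <p>Of the eight combinations of three binary statements, only three survive the constraint, and the resulting beliefs sum to one, just as the state probabilities of a ternary node would.</p>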
        <p>10 They limit model size by factoring 12 outcome hypotheses
into three outcome aspects: who, where, and why. FUSION can
support this approach.</p>
        <p>We agree with these authors’ statements below regarding
ACH uninformed by mathematically sound probability
reasoning. These statements also apply to argument maps so
uninformed.</p>
        <disp-quote>
          <p>The measures of consistency, relevance, and credibility
are poorly defined and elicited unreliably. This allows for
highly subjective and unique interpretations among
analysts. For example, the consistency measure should
answer a well-defined question such as, “Given hypothesis
H, how likely are we to see evidence e?” rather than the
question “How consistent are hypothesis H and evidence
e?” Emphasizing the direction of the question can clear
up confusion between interpretations. (p213)</p>
        </disp-quote>
        <disp-quote>
          <p>The Senate Select Committee on Intelligence [<xref ref-type="bibr" rid="ref11">12</xref>]
criticized the pre-war assessments of weapons of mass
destruction (WMDs) in Iraq for the tendency of analysts to
consider uncertainty only at each separate stage of
reasoning rather than over the whole chain of reasoning.
Heuer [<xref ref-type="bibr" rid="ref4">4</xref>] was not unaware of this problem, but he
offered limited advice on the subject for ACH users. (p215)</p>
        </disp-quote>
        <p>Karvetski et al. acknowledge the value of argument
mapping as an elicitation tool, but do not go so far as to
integrate it with their product BN in an argument mapping
tool, as FUSION does.</p>
        <p>
          Probabilistic abstract argumentation frameworks [
          <xref ref-type="bibr" rid="ref7">8</xref>
          ]
assign probabilities to nodes and links locally and use these
to compute probabilities globally. These frameworks
generally assume conditional independence among all
nodes, so they do not accommodate conditional probabilities and
cannot meaningfully capture causality or other rich
relationships.<sup>11</sup>
        </p>
        <p>
          Markov logic networks, similarly, do not naturally
accommodate conditional probabilities, so representing
causality is cumbersome [
          <xref ref-type="bibr" rid="ref5">6</xref>
          ]. They are notoriously hard to
build directly. More often, they are applied in a machine
learning setting. They are attractive in that the only
parameters to be specified are weights on logical formulas.
We implemented a propositional Markov logic interpreter to
experiment with the Iraq retaliation scenario but were
unable to engineer the necessary fundamental conditional
dependence relationships (without going all the way to
implement BNs, less efficiently, in this framework).
        </p>
        <p>
          11 We take the recent dissertation of Li [
          <xref ref-type="bibr" rid="ref7">8</xref>
          ] to be representative of
the state of the art. Li proposes framework extensions to
accommodate conditional independence and, after having briefly mentioned
BNs, chooses Nilsson’s probabilistic logic [
          <xref ref-type="bibr" rid="ref8">9</xref>
          ] as a foil and
dismisses the lot: “The standard uncertainty management approaches
as mentioned are unable to propagate uncertainty through
argument evaluation; i.e., given uncertainty associated with arguments,
these approaches cannot propagate the uncertainty to uncertainty
about which arguments are justified.” (p12) FUSION does this now.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>6 Conclusion</title>
        <p>
          Probabilistic argument maps are applicable wherever
traditional argument maps are. By choosing Logic statement
nodes and/or by applying hard findings to upstream-most
non-Logic nodes, a probabilistic argument map can be
rendered entirely deterministic. Thus, FUSION models are a
superset of standard argument maps. Probabilistic
reasoning offers a powerful alternative to crisp logical
reasoning, accounting naturally for uncertainty about
evidence or influences. FUSION also probabilistically
enhances nonmonotonic defeat and relevance reasoning—
via its MitigatedBy and RelevantIf link types. We continue
to develop and apply the FUSION framework [
          <xref ref-type="bibr" rid="ref10">11</xref>
          ].
        </p>
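        <p>The claim that hard findings and Logic nodes render a map deterministic can be illustrated with a small Python sketch of an Or-type Logic node. The function name is hypothetical, and treating the parents as independent is a simplifying assumption made only to keep the arithmetic short; it is not a statement about how FUSION computes beliefs.</p>
        <preformat><![CDATA[
```python
def or_node_belief(parent_beliefs):
    """Belief that at least one parent statement is true.

    Assumes the parents are independent (an illustrative simplification).
    A hard finding is represented as a belief of exactly 1.0 or 0.0.
    """
    p_all_false = 1.0
    for p in parent_beliefs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("beliefs must lie in [0, 1]")
        p_all_false *= 1.0 - p
    return 1.0 - p_all_false


if __name__ == "__main__":
    # Uncertain parents give a graded belief.
    print(or_node_belief([0.3, 0.5]))
    # Hard findings on every parent give a crisp truth value.
    print(or_node_belief([1.0, 0.0]))
    print(or_node_belief([0.0, 0.0]))
```
]]></preformat>
        <p>Once every upstream-most statement carries a hard finding, each belief in such a node is exactly 0 or 1, so the probabilistic map reduces to ordinary crisp logic.</p>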
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] CIA Directorate of Intelligence, “<article-title>A Tradecraft Primer: The Use of Argument Mapping</article-title>,” <source>Tradecraft Review</source> 3(1), Kent Center for Analytic Tradecraft, Sherman Kent School, <year>2006</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] Kevin Burns, “<article-title>Bayesian HELP: Assisting Inferences in All-Source Intelligence</article-title>,” <source>Cognitive Assistance in Government, Papers from the AAAI 2015 Fall Symposium</source>, <fpage>7</fpage>-<lpage>13</lpage>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] Kevin Burns, “<article-title>Integrated Cognitive-neuroscience Architectures for Understanding Sensemaking (ICArUS): Phase 2 Challenge Problem Design and Test Specification</article-title>,” <source>MITRE Technical Report, MTR 149412</source>, McLean, VA, <year>2014</year>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] Richards J. Heuer, Jr., <source>Psychology of Intelligence Analysis</source>, Central Intelligence Agency Historical Document. https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis (Posted: Mar 16, 2007 01:52 PM. Last Updated: Jun 26, 2013 08:05 AM.)</mixed-citation>
        <mixed-citation>[5] P. E. Lehner, L. Adelman, B. A. Cheikes, and M. J. Brown, “<article-title>Confirmation bias in complex analyses</article-title>,” <source>IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans</source>, <volume>38</volume>(<issue>3</issue>), <fpage>584</fpage>-<lpage>592</lpage>, <year>2008</year>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[6] Dominik Jain, “<article-title>Knowledge Engineering with Markov Logic Networks: A Review</article-title>,” in <source>DKB 2011: Proceedings of the Third Workshop on Dynamics of Knowledge and Belief</source>, <year>2011</year>.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[7] Christopher W. Karvetski, Kenneth C. Olson, Donald T. Gantz, and Glenn A. Cross, “<article-title>Structuring and analyzing competing hypotheses with Bayesian networks for intelligence analysis</article-title>,” <source>EURO J Decis Process</source> (<year>2013</year>) <volume>1</volume>:<fpage>205</fpage>-<lpage>231</lpage>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[8] Hengfei Li, <source>Probabilistic Argumentation</source>, A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy of the University of Aberdeen Department of Computing Science, <year>2015</year>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[9] Nils J. Nilsson, “<article-title>Probabilistic Logic</article-title>,” <source>Artificial Intelligence</source>, <volume>28</volume>(<issue>1</issue>):<fpage>71</fpage>-<lpage>87</lpage>, <year>1986</year>.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[10] Robert Schrag, Edward Wright, Robert Kerr, and Bryan Ware, “<article-title>Processing Events in Probabilistic Risk Assessment</article-title>,” <source>9th International Conference on Semantic Technologies for Intelligence, Defense, and Security (STIDS)</source>, <year>2014</year>.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[11] Robert Schrag, Edward Wright, Robert Kerr, Robert Johnson, Bryan Ware, Joan McIntyre, Melonie Richey, Kathryn Laskey, and Robert Hoffman, “<article-title>Probabilistic Argument Maps for Intelligence Analysis: Capabilities Underway</article-title>,” <source>16th Workshop on Computational Models of Natural Argument</source>, <year>2016</year>.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[12] United States Senate Select Committee on Intelligence, “<article-title>Report on the U.S. Intelligence Community's Prewar Intelligence Assessments on Iraq</article-title>,” One Hundred Eighth Congress, Second Session. U.S. Government Printing Office, Washington, DC, <year>2004</year>.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[13] Edward Wright, Robert Schrag, Robert Kerr, and Bryan Ware, “<article-title>Automating the Construction of Indicator-Hypothesis Bayesian Networks from Qualitative Specifications</article-title>,” <source>Haystax Technology technical report</source>, <year>2015</year>, https://labs.haystax.com/wp-content/uploads/2016/06/BMAW15-160303-update.pdf.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>