Contributions to a Semantically Based Intelligence Analysis Enterprise Workflow System

Robert C. Schrag, Jon Pastor, Chris Long, Eric Peterson, Mark Cornwell, Lance A. Forbes, and Stephen Cannon

Abstract—We have contributed key elements of a semantically based intelligence analysis enterprise workflow architecture: a uniformly accessible semantic store conforming to an enterprise-wide ontology; a branching context representation to organize workflow components' analytical hypotheses; a logic programming-based, forward-chaining query language for components to access data from the store; and a software toolkit embracing all the foregoing to streamline the process of introducing additional legacy software components as semantically interoperable workflow building blocks.

We explain these contributions, focusing particularly on the toolkit. For certain widely used input/output formats—e.g., comma-separated value (CSV) files—a knowledgeable user can quickly "wrap" a newly installed component for workflow operation by providing a compact and entirely declarative specification that uses the query language to map specific relation arguments in the ontology to specific structural elements in the component's native input and output formats.

Our contributions are built to work with AllegroGraph, from Franz, Inc.

Index Terms—Intelligence analysis, enterprise workflow, hypothesis representation, branching contexts, semantic interoperability, declarative data transformation, software component wrapping

I. INTRODUCTION

We have contributed key elements of a semantically based intelligence analysis enterprise workflow architecture for Tangram, a multi-year, multi-contractor threat surveillance and alerting research and development program sponsored by the United States' Intelligence Advanced Research Projects Agency (IARPA). Tangram's objective has been to automate routine analysis workflows, so that these can be executed as standing processes, on a large scale.

To support the rapidly changing needs of an intelligence enterprise, a workflow authoring tool must be extremely flexible. The enterprise must be able to rearrange components (e.g., pattern matchers, classifiers, group detectors) in the same kind of way that a child rearranges Lego bricks, and it must be able to introduce new software into the enterprise rapidly. However, Lego bricks have a distinct advantage over legacy software components from different sources: they were all created to respect a common interface. One brute-force approach to integrating legacy components is to manually develop code that transforms data from one form (e.g., Java objects) to another (e.g., flat files); that requires O(n²) transforms. Tangram's approach reduces the required number of transforms to O(n), and our toolkit enables knowledgeable users to "wrap" legacy components with such transforms, making the components workflow-ready quickly.

To motivate our contributions, we present the (notional, simplified) two-component workflow in Fig. 1: a suspicion scorer hypothesizes potential terrorists, then a group detector clusters the hypothesized terrorists into hypothesized potential terrorist groups.

[Fig. 1: Suspicion Scoring Component → Group Detection Component]
Fig. 1 A notional intelligence analysis workflow

The workflow in Fig. 1 raises some enterprise-level architecture issues that our contributions address.

1) What are components' input and output data, how is data stored, and how do components access it? We have introduced a uniformly accessible semantic store conforming to an enterprise-wide ontology and a logic programming-based, forward-chaining query language for components to access data from the store.
Component specifications (see Issue 3 below) indicate what data is accessed in particular.

2) How are the hypotheses that analytical components produce distinguished from background data, and how are they communicated among components? As hypotheses, analytical components' outputs must not simply be mixed indiscriminately with more uniformly credible evidence data or with each other. Among other considerations, the broad body of evidence changes over time (leading to different hypotheses), and different components—or different (e.g., control) configurations thereof—can lead to different hypotheses even for the same inputs. We organize the content of the semantic store into distinct RDF graphs that we call "datasets," and (correlating datasets with contexts) represent the outputs of successively applied analytical components as branching contexts (that incrementally add information). Our component specifications and our query language thus include parameters for the datasets that are passed among or otherwise accessed by components. Besides these datasets for hypotheses, the store includes one or more background, or "evidence," datasets and, for convenience, some intermediate (i.e., not necessarily hypothetical) datasets that result from purely logical queries. This treatment of evidence and hypotheses, together with the above-mentioned query language, provides a practical, implemented solution to meet broad Tangram requirements outlined in [5].

3) How can legacy components with arbitrary input/output formats easily be made to interact with the data? The contributions above are integrated in a software toolkit to streamline the process of introducing additional legacy software components as semantically interoperable workflow building blocks. For certain widely used input/output formats—e.g., comma-separated value (CSV) files—a knowledgeable user can quickly wrap a newly installed component for workflow operation by providing a compact and entirely declarative specification that uses the query language to map specific relation arguments in the ontology to specific structural elements in the component's native input and output formats. The toolkit also provides some less fully automated interface options to address more general input/output situations.

Manuscript submitted August 19, 2009. This work was supported in part by the U.S. Government. All authors were with Global InfoTek, Inc., 1920 Association Dr, Suite 600, Reston, VA USA 20191, 703-652-1600 (e-mail: firstinitialLastname@globalinfotek.com). C. Long is now with SET Corp., Arlington, VA, 703-738-6214 (e-mail: clong@setcorp.com). L. A. Forbes is now with Solutions Made Simple, Inc., Reston, VA (e-mail: lforbes@sms-fed.com).

II. ARCHITECTURAL SCHEME OF A WORKFLOW COMPONENT

Fig. 2 presents our general scheme for wrapping legacy components.

[Fig. 2: a wrapped component comprises a query against the common semantic store, a transform from the common ontology to the native format, the native component itself, a transform from the native format back to the common ontology, and an assert to the common semantic store.]
Fig. 2 Component wrapping scheme

Fig. 2 schematizes a single wrapped component that executes processes to:
1) Retrieve input data, expressed in the enterprise's common ontology, from the central semantic store.
2) Format the input data for the legacy component.
3) Invoke the legacy component in its "native" (unwrapped) form.
4) Convert the legacy component's native-format outputs to the common ontology, as metadata-bearing hypotheses.
5) Assert the output hypotheses to the central store.

We implement the central semantic store using AllegroGraph from Franz, Inc. AllegroGraph is a "quad" store that includes, in addition to the "subject," "predicate," and "object" fields standard to RDF and common to triple stores, a "graph" field. We use this field to distinguish among the various datasets that are available as inputs or have been produced as outputs of workflow components.

We provide a knowledge base (KB) query language supporting a wrapped component's query and assertion processes and allowing users to define, for specific analytical purposes, KB query components (which involve no legacy process) that combine elements from one or more existing datasets into one or more output datasets. We implement legacy component wrappers and KB query components using the Prolog and Common Lisp interfaces to AllegroGraph.

Fig. 3 illustrates the meta-data classes (noted in bold) and attributes (with multi-valued attributes starred*) that support the representation of a dataset's context lineage. We take each workflow component's execution, noted in a ProcessExecution (PE) object, as the source of the statements in any output (hypothesis) dataset; lineage is manifested in the connections among datasets, process executions, and workflow executions (noted in WorkflowExecution objects).

[Fig. 3:]
WorkflowExecution
  hasProcessExecution*
ProcessExecution
  hasProcess (e.g., GDA)
  hasPEDatasetInput*
  hasPEDatasetOutput*
  hasPEControlInput*
ProcessExecutionDatasetInput
  hasParameterName (consistent with Process)
  hasInputDataset
ProcessExecutionDatasetOutput
  hasParameterName (consistent with Process)
  hasOutputDataset
ProcessExecutionControlInput
  hasParameterName
  hasValue
Fig. 3 Meta-data classes and attributes for hypothesis datasets

As noted in Section I, the interpretation of datasets as contexts is incremental along a dataset's lineage: in general, any statement that holds in a dataset that is upstream (workflow-wise) from a given dataset D created during a workflow also (implicitly) holds in D. The representation is thus space-efficient. We have not yet found it necessary to implement such transitivity of dataset contexts directly in the KB query language; our current workflow components use just background (evidence) datasets and datasets that their immediate workflow predecessors create.
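This incremental context interpretation can be sketched as follows. The Python below is illustrative only, not part of the toolkit: the store is modeled as a set of subject–predicate–object–graph quads, and a lineage table maps each dataset to its immediate workflow predecessor; the function and variable names are ours.

```python
# Illustrative sketch (not the toolkit's implementation) of the incremental
# context interpretation: a statement (implicitly) holds in dataset D if it
# was asserted in D or in any dataset upstream of D along the workflow lineage.

def holds_in(store, lineage, statement, dataset):
    """store: set of (s, p, o, graph) quads; lineage: dataset -> predecessor."""
    d = dataset
    while d is not None:
        if statement + (d,) in store:
            return True
        d = lineage.get(d)          # step to the immediate upstream dataset
    return False

# Lineage evidence -> links -> output, as in the use case of Section III:
store = {("alice", "rdf:type", "teo:Person", "evidence"),
         ("G0", "teo:orgMember", "alice", "output")}
lineage = {"links": "evidence", "output": "links"}
```

Here a Person statement asserted only in the evidence dataset also (implicitly) holds in the downstream output dataset, without being copied there—which is why the representation is space-efficient.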
III. USE CASE WORKFLOW

Fig. 4 presents a use case workflow including both a wrapped legacy component and a KB query component.

[Fig. 4: the Watchlist-Evidence Dataset Join Component reads ?watchlistGraph and ?evidenceGraph and writes ?linkGraph; the Group Detection Component reads ?linkGraph and writes ?outputGraph.]
Fig. 4 Use case workflow (see Section III)

In Fig. 4, datasets (graphs) are depicted by square-cornered boxes; workflow components are depicted by round-cornered boxes. Each component reads data from one or more input graphs and writes to one or more output graphs. Here, a dataset join KB query component is used to select from broader evidence (right) just information relevant to watchlisted terrorist suspects (left) for processing by a downstream legacy group detection component.

In our toolkit, the defining forms for workflow components are Lisp macro calls. Beyond providing one or more files containing such definitions, toolkit users need never interact directly with Lisp or with AllegroGraph, as we provide alternative interfaces.

IV. KB QUERY COMPONENTS AND QUERY LANGUAGE

The definition for the KB query component used in Fig. 4 appears below.

(defKB-query-component
  group-detection-watchlist-evidence-dataset-join-component
  ((and (q- ?Event !rdf:type !teo:TwoWayCommunicationEvent ?evidenceGraph)
        (q- ?Event !teo:sender ?sender ?evidenceGraph)
        (q- ?Event !teo:receiver ?receiver ?evidenceGraph)
        (q- ?sender !rdf:type !teo:Person ?evidenceGraph)
        (q- ?receiver !rdf:type !teo:Person ?evidenceGraph)
        (q- ?sender !rdf:type !teo:Person ?watchlistGraph)
        (q- ?receiver !rdf:type !teo:Person ?watchlistGraph)
        (a- ?Event !rdf:type !teo:TwoWayCommunicationEvent ?linkGraph)
        (a- ?Event !teo:deliberateActor ?sender ?linkGraph)
        (a- ?Event !teo:deliberateActor ?receiver ?linkGraph)
        (a-- ?sender !rdf:type !teo:Person ?linkGraph)
        (a-- ?receiver !rdf:type !teo:Person ?linkGraph))))

The above component selects events from one dataset (denoted by the logic variable ?evidenceGraph) whose participants also appear in another dataset (denoted by ?watchlistGraph) and asserts the links among them in an output dataset (denoted by ?linkGraph) for consumption by a group detection component. Note the following.

• This component performs a single KB query that implicitly conjoins (logically) the twelve top-level (q-, a-, and a--) forms.
• A q- conjunct succeeds iff a triple (in subject, predicate, object, graph, index—"spogi"—format) exists in the workflow KB. q- is included in the standard Franz Allegro Prolog interface to AllegroGraph.
• a- indicates that a triple is to be written to the specified output dataset. An a- conjunct always succeeds. a- and its duplicate-avoiding twin a-- (below) are our contributions that confer the KB query language's forward-chaining character.
• a-- indicates that a triple is to be written to the workflow KB iff it is not already present there. An a-- conjunct always succeeds.
• !rdf:type is an example of a shorthand that expands to http://www.w3.org/1999/02/22-rdf-syntax-ns#type—the atom type in the namespace for RDF. (!teo: refers to an application-specific ontology.)
• ?Event, ?sender, and other symbols beginning with ? are logic programming (AKA Prolog) variables. In the logic programming style we support, every logic variable becomes bound when the q- conjunct is matched in the KB.
• Prolog will backtrack to execute each conjunct in the KB query for every combination of variable bindings for which the preceding conjuncts succeed.
• The KB query language provides a variety of additional constructs (e.g., and, or, not) in which the usual expressions that appear as top-level conjuncts may be embedded—e.g.,
  (and (not (q- ?P !rdf:type !teo:Terrorist ?evidenceGraph))
       (or (q- ?P1 !rdf:type !teo:Terrorist ?evidenceGraph)
           (q- ?P2 !rdf:type !teo:Terrorist ?evidenceGraph))).
• While the repetition of entity type statements—e.g., (a-- ?sender !rdf:type !teo:Person ?linkGraph)—from the input graph is not strictly necessary given our context interpretation, the Tangram contractors agreed that it would be convenient to include such declarations uniformly in all datasets.

Below are the definitions for some utility KB query components that we provide with the toolkit distribution.

(defKB-query-component 2-input-dataset-union-component
  (DataUnionProcess)
  ((query (q- ?S ?P ?O ?sourceGraph1)
          (a- ?S ?P ?O ?destGraph))
   (query (q- ?S ?P ?O ?sourceGraph2)
          (a- ?S ?P ?O ?destGraph))))

(defKB-query-component 3-input-dataset-intersection-component
  (DataIntersectionProcess)
  ((query (q- ?S ?P ?O ?sourceGraph1)
          (q- ?S ?P ?O ?sourceGraph2)
          (q- ?S ?P ?O ?sourceGraph3)
          (a- ?S ?P ?O ?destGraph))))

(defKB-query-component dataset-de-duplication-component ()
  ((query (q- ?S ?P ?O ?sourceGraph)
          (a-- ?S ?P ?O ?destGraph))))

The (first) dataset union component writes everything it finds in either of its source graphs into its destination graph; the (second) intersection component writes anything it finds in all of its sources into the destination. A workflow author may choose to follow either of these up with the (third) dataset de-duplication component to remove duplicates; note that the author could achieve the same effect by using a-- rather than a- conjuncts in the union components' definitions.
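For exposition, the q-/a- semantics just described can be modeled compactly. The Python below is an illustrative sketch only—the toolkit itself is implemented via the Prolog and Common Lisp interfaces to AllegroGraph—with the store modeled as a plain set of subject–predicate–object–graph tuples and with function names of our own choosing.

```python
# Illustrative model of the KB query semantics (not the toolkit's
# implementation).  Variables are strings beginning with "?".

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def unify(pattern, quad, env):
    """Match a 4-term pattern against a quad, extending the bindings in env;
    return the extended bindings, or None on mismatch."""
    env = dict(env)
    for p, value in zip(pattern, quad):
        if is_var(p) and p in env:
            p = env[p]                      # variable already bound
        if is_var(p):
            env[p] = value                  # bind a fresh variable
        elif p != value:
            return None
    return env

def run_query(store, conjuncts, env=None):
    """Backtrack through the conjuncts: ("q-", s, p, o, g) enumerates matching
    quads; ("a-", ...) asserts and always succeeds.  With a set-valued store,
    a- and the duplicate-avoiding a-- coincide, so both are treated alike."""
    env = {} if env is None else env
    if not conjuncts:
        yield env
        return
    op, *pattern = conjuncts[0]
    if op == "q-":
        for quad in list(store):            # snapshot: assertions may grow it
            extended = unify(pattern, quad, env)
            if extended is not None:
                yield from run_query(store, conjuncts[1:], extended)
    else:                                   # "a-" or "a--"
        store.add(tuple(env.get(t, t) for t in pattern))
        yield from run_query(store, conjuncts[1:], env)

# The 2-input dataset union component, restated as two query/assert pairs:
store = {("e1", "type", "Event", "g1"), ("e2", "type", "Event", "g2")}
for query in [[("q-", "?S", "?P", "?O", "g1"), ("a-", "?S", "?P", "?O", "dest")],
              [("q-", "?S", "?P", "?O", "g2"), ("a-", "?S", "?P", "?O", "dest")]]:
    for _ in run_query(store, query):       # drive backtracking to completion
        pass
```

After the union runs, the destination graph holds a copy of each source statement, mirroring the forward-chaining behavior of a- conjuncts.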
Existing Tangram workflow and process infrastructure required that we specify the fixed (e.g., two-input) arities for the components above. This might not be the case in every workflow setting of interest (see Section VIII). Likewise, it might not be necessary to name (or permanently componentize) every query before it can be used.

V. WRAPPED LEGACY COMPONENTS

Toolkit users define wrappers for legacy/native components using the Lisp macro defWrapped-component, which affords a choice among three distinct interfaces. Non-Lisp-programming toolkit users will want to use one of the first two interfaces described below; Lisp-programming users are most likely to use the first or third.
1) Fully automatic: defWrapped-component writes a comma-separated value (CSV) or other delimited text file (to be consumed by the native component) for each input dataset and automatically reads a delimited text file (produced by the native component) for each output dataset. For native components with delimited text file-oriented input/output, the toolkit user need provide no additional wrapping code.
2) Semi-automatic: defWrapped-component automatically writes an ntriples file for each input dataset and automatically reads an ntriples file for each output dataset. The toolkit user provides additional (presumably non-Lisp), shell-callable wrapping code as necessary to mediate between these ntriples files and the native component.
3) Manual: The toolkit user provides, via an additional argument to defWrapped-component, custom Lisp code to implement the required native component interface. Here we assume that the Lisp programmer will interact directly with AllegroGraph to create suitable inputs for the native component.

In the sequel, we focus primarily on the fully automatic interface.

Consider the GDA group detection algorithm [3] from CMU's Auton Lab, which uses CSV input and output files as shown in Fig. 5. The group detector uses event-based linkages among individuals to infer groups of associating individuals. Each input line indicates evidence that a certain event involves a certain individual. Each output line indicates that a certain individual is hypothesized to belong to a certain group.

Native GDA Input:     Native GDA Output:
Ev-1194,In-10381      group,entity
Ev-709,In-15840       G0,In-10096
Ev-709,In-36232       G0,In-15840
Ev-38749,In-4938      G0,In-19354
Ev-38749,In-48834     G0,In-19540
Ev-34121,In-3007      G0,In-19625
Ev-34121,In-35214     G0,In-21371
Ev-65474,In-21371     G0,In-28719
Ev-65474,In-19354     G0,In-37201
Ev-23484,In-39017     G0,In-37733
Ev-23484,In-16809     G0,In-38634
…                     G0,In-47910
                      G1,In-1002
                      …
Fig. 5 CSV input/output files for the GDA group detection component

Below is a toolkit-based component definition that invokes the automatic CSV file interface to wrap GDA. The (completely declarative) definition specifies that GDA-component-TerroristGroup is an instance of the class GroupDetectionProcess (see [9]). The (keyword) argument :native-input-CSV-file-specs specifies the relation of the input CSV file (to be named "GDA-input-links.csv") to the input dataset (bound to the Prolog variable ?linkGraph).¹ Note that the separating character may be specified, using the :text-delimiter argument, and the presence of a header line via the :headerline argument. The argument :native-output-CSV-file-specs specifies the relation of the output CSV file (to be named "GDA-output-groups.csv") to the output dataset (bound to ?outputGraph). The remaining top-level arguments specify how to invoke the native component. Further explanation follows the definition.

(defWrapped-component GDA-component-TerroristGroup
  (GroupDetectionProcess)
  :native-input-CSV-file-specs
  (("GDA-input-links.csv"
    :query
    (query
      (q- ?E !teo:deliberateActor ?P ?linkGraph))
    :query-type select
    :headerline nil
    :text-delimiter ","
    :query-template (?E ?P)))
  :native-output-CSV-file-specs
  (("GDA-output-groups.csv"
    :query
    (query
      (a- ?G !teo:orgMember ?P ?outputGraph)
      (a-- ?G !rdf:type !teo:TerroristGroup ?outputGraph)
      (a-- ?P !rdf:type !teo:Terrorist ?outputGraph))
    :headerline t
    :CSV-template (?G ?P)
    :namespace-template
    ("http://anchor/teo#" "http://anchor/teo#")))
  :native-component-directory "GDA_DISTRIBUTION"
  :native-component-command-name "gda_applic"
  :native-component-command-arguments
  ("GDA-output-groups.csv" "GDA-input-links.csv"))

¹ The full interface supports any number of native input and of native output delimited text files and corresponding datasets/graphs.

Fig. 6 illustrates how the :native-input-CSV-file-specs argument is processed.

[Fig. 6:]
General Query Conjunct: (q- ?E !teo:deliberateActor ?P ?linkGraph)
Instantiated Query Conjunct: (q- !teo:Ev-1194 !teo:deliberateActor !teo:In-10381 ?linkGraph)
General Query Template: (?E ?P)
Instantiated Query Template: (!teo:Ev-1194 !teo:In-10381)
Native GDA Input File: Ev-1194,In-10381 / Ev-709,In-15840 / Ev-709,In-36232 / …
Fig. 6 Automatic CSV file input mechanism

First, we execute the input query against the input dataset (graph). At top right, Fig. 6 illustrates how the query's single (general) conjunct is first specifically instantiated, binding the conjunct's variables to values for which a triple exists in the input graph. The :query-template argument specifies how the query's bound variable values should be ordered in the CSV file. At bottom, Fig. 6 illustrates the intermediate step of instantiating the query template, based on the instantiated query conjunct. At left, Fig. 6 shows how we generate one CSV file line per query instantiation.² (Note that the RDF namespace, !teo:, is removed, as it is not useful to the native component.)

² This is per the value select specified for the :query-type argument, which indicates that duplicate links (useful to GDA) are to be retained in the input dataset. By instead using the (default) value select-distinct, the user may alternatively specify one line per unique query instantiation (thus removing duplicates).

Fig. 7 illustrates how the native component is (next) invoked by the workflow execution system. Execution takes place in a temporary directory specific to the given workflow and component instance.

[Fig. 7:]
Directory: $GU_CORE/GDA_DISTRIBUTION
Command-name: gda_applic
Command-arguments: GDA-output-groups.csv GDA-input-links.csv
Fig. 7 Automatic CSV file native component calling mechanism

Fig. 8 illustrates how the :native-output-CSV-file-specs argument is (next) processed. The process is here roughly the reverse of that in Fig. 6. At bottom, Fig. 8 illustrates how we first interpret each line of the output CSV file (at right) using the template specified (via the :CSV-template argument), instantiating the template and binding query variables. Again, the template indicates the order of each bound Prolog variable in each line of the CSV file. Note the final template instantiation step that inserts appropriate RDF namespaces (per the :namespace-template argument). At right, Fig. 8 illustrates how these bindings are used to instantiate each specified output assertion (query conjunct). Each assertion is executed to add a triple to the semantic store (with appropriate treatment of duplicates).

[Fig. 8:]
Native GDA Output File: group,entity / G0,In-10096 / G0,In-15840 / …
General CSV / Query Template: (?G ?P)
Instantiated CSV Template: (G0 In-10096)
Instantiated Query Template: (!teo:G0 !teo:In-10096)
Gen.: (a- ?G !teo:orgMember ?P ?outputGraph) → Inst.: (a- !teo:G0 !teo:orgMember !teo:In-10096 ?outputGraph)
Gen.: (a-- ?G !rdf:type !teo:TerroristGroup ?outputGraph) → Inst.: (a-- !teo:G0 !rdf:type !teo:TerroristGroup ?outputGraph)
Gen.: (a-- ?P !rdf:type !teo:Terrorist ?outputGraph) → Inst.: (a-- !teo:In-10096 !rdf:type !teo:Terrorist ?outputGraph)
Fig. 8 Automatic CSV file output mechanism

VI. CONCEIVED FULL AUTOMATION FOR COMPONENTS WITH XML INPUT/OUTPUT FILES

While delimited text input/output formats are quite prevalent, they are by no means the only structured formats of interest. We have also designed (not yet implemented) a similar, declaratively specified wrapping capability for components with XML file input/output. The general idea is to embed a similar query specification into the XML file where data is to be read or written. Another alternative, on the input side (only), would be integration of XPath and XQuery with logic programming. (See [1] for a recent survey.)
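In either direction, the template machinery of Figs. 6 and 8 amounts to ordering bound values into delimited lines and back, with namespaces stripped on output to the native component and re-attached on input from it. A minimal Python sketch (the function names and the dictionary representation of bindings are ours, not the toolkit's):

```python
import csv
import io

TEO = "http://anchor/teo#"   # the example namespace of Figs. 6 and 8

def bindings_to_csv(solutions, template, namespace=TEO):
    """Input direction (cf. Fig. 6): order each solution's bound values per
    the template and strip the namespace, one CSV line per instantiation."""
    out = io.StringIO()
    writer = csv.writer(out)
    for env in solutions:
        writer.writerow([env[var].replace(namespace, "") for var in template])
    return out.getvalue()

def csv_to_bindings(text, template, namespaces, headerline=True):
    """Output direction (cf. Fig. 8): bind each line's fields to the template
    variables, re-attaching a namespace to each field."""
    rows = list(csv.reader(io.StringIO(text)))
    if headerline:
        rows = rows[1:]                     # discard the header line
    return [dict(zip(template, (ns + field
                                for ns, field in zip(namespaces, row))))
            for row in rows]

# One solution of the input query, and two lines of GDA output:
line = bindings_to_csv([{"?E": TEO + "Ev-1194", "?P": TEO + "In-10381"}],
                       ["?E", "?P"])
bound = csv_to_bindings("group,entity\nG0,In-10096\n", ["?G", "?P"],
                        [TEO, TEO])
```

In the real toolkit these roles are played by the declarative :query-template, :CSV-template, :headerline, and :namespace-template arguments shown above.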
VII. THE WRAPPING PROCESS

The toolkit's comprehensive documentation (available from the first author) details the following steps included in the end-to-end process of wrapping and then deploying components.
1) Install the wrapping toolkit.
2) Install the native component so that it will be accessible to the wrapper.
3) Define any KB query component(s) needed to select appropriate data from any broader dataset(s).
4) Define the wrapper for the native component.
5) Test both KB query and wrapped native components to ensure effective operation. We have developed and applied a testing framework that includes component concurrency (i.e., re-entrance) testing.
6) Deploy the developed and tested components.

These steps may of course be undertaken by different classes of users. E.g., in a component wrapping team (of which an enterprise may have several), one member (the "installer") may be primarily responsible for software installations; another (the "developer") may be expert with the enterprise's ontology, workflows, and datasets, the KB query language, and the component defining forms; still another (the "tester") may primarily have testing responsibilities, and another (perhaps the "installer" again) deployment responsibilities. "Scripters" might write custom Lisp wrapping code or shell scripts or other command line-callable programs to perform data transformations not (yet) supported by toolkit (semi-)automation.

For each component to be wrapped, the wrapping team also should include, or at least have access to, a component "champion" who knows what enterprise function(s) the component must accomplish and understands how the component works well enough to address any wrapping issues (e.g., whether duplicate assertions are or are not appropriate, what native component control parameters are appropriate). The champion should bring one or more exemplary use cases (preferably expressed in terms of the enterprise's datasets and ontology) and should help the wrapping team realize the use case(s) in component (and workflow) definitions.³

³ Consider that a champion may also bring a new data source that may require extensions or other modifications to the enterprise ontology. Addressing such issues has been the responsibility of a different Tangram contractor.

Finally, the component wrapping team always should be able to present new requirements to the toolkit development team (who may serve multiple enterprises).
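The runtime cycle that a deployed wrapped component then performs—Section II's steps 1–5, using Fig. 7's temporary-directory calling convention—can be sketched as follows. This is an illustrative Python harness, not the toolkit's implementation: the file names, the use of stdout to capture the native output, and the stand-in POSIX sort command are our assumptions, not GDA's conventions.

```python
import csv
import subprocess
import tempfile
from pathlib import Path

def invoke_native(command, input_rows):
    """Write the native input CSV, run the native command in an
    execution-specific temporary directory (cf. Fig. 7), capture its stdout
    as the native output file, and read that output back as rows."""
    workdir = Path(tempfile.mkdtemp(prefix="wrapped-run-"))
    in_path = workdir / "native-input.csv"
    out_path = workdir / "native-output.csv"
    with in_path.open("w", newline="") as f:
        csv.writer(f).writerows(input_rows)
    with out_path.open("w") as out:         # run inside the temp directory
        subprocess.run(command + [in_path.name], cwd=workdir,
                       stdout=out, check=True)
    with out_path.open(newline="") as f:
        return list(csv.reader(f))

# Stand-in "native component": POSIX sort, which just orders the input lines.
rows = invoke_native(["sort"], [["b", "2"], ["a", "1"]])
```

In the toolkit, the transforms on either side of this invocation are what the declarative :native-input-CSV-file-specs and :native-output-CSV-file-specs arguments generate automatically.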
We developed the toolkit during roughly six months of concentrated effort, to serve both the broader Tangram community and ourselves. Starting with the use case presented in Section III, we developed first the KB query language and KB query components, then progressively more automatic interfaces with which we wrapped GDA (initially). We also have used (or assisted others to use) the toolkit to wrap the ORA group detection algorithm [2], suspicion scorers based on the Proximity [6] and NetKit [4] classifiers, and the pattern matchers LAW [8] and CADRE [7].

We have met the Tangram program's toolkit usability goals: as knowledgeable users, we can usually (for components with inputs/outputs amenable to the toolkit's fully automatic interface) complete Steps 3 and 4 of the above wrapping process within a single staff hour.

VIII. RELAXING THE CONTEXT MONOTONICITY ASSUMPTION

Implicit in the semantics of current Tangram workflow processing is the following monotonicity assumption: a component's output graph(s) only add(s), logically, to the information in its input graph(s), never delete(s) or retract(s). This is not entirely practical.

The need to manage potentially conflicting source information and analytic hypotheses is ubiquitous in an intelligence analysis enterprise. An analyst, surrounded with data and applicable tools or methods, may choose to pursue one line of reasoning at one time and another later, and different analysts may take different approaches and may build on each other's analyses or workflow products. Each such approach—a combination of data, tools, methods, and earlier hypotheses—represents a context for analytical reasoning. It is important within the enterprise for each analyst to understand the actual context of each piece of information that s/he might examine and exploit in further analysis—in which s/he may either extend an existing context or branch to create a new subcontext.

Different contexts may arise in workflow-supported analytical reasoning for different reasons, including:
• Differences in supporting data, from:
  o Conflicting original data sources.
  o Time-varying data conditions for a given source, such as:
    - Disbelief in something we earlier had belief in (perhaps because it had been supplied in error).
    - Belief in something we did not have belief in (perhaps because we had no data about it).
• Differences in supporting analytical hypotheses, from:
  o Analyst's conjecture, or "what-if" analysis (that may effect belief or disbelief in data as discussed above).
  o Differences in workflow components giving rise to different answers, when:
    - A given workflow function has alternative realizations in different components.
    - A given component has alternative configurations of control parameters.

We have commenced efforts to address these issues both formally and with appropriate workflow system infrastructure.

IX. CONTRIBUTIONS' RELEVANCE BEYOND TANGRAM

The use case workflow in Section III includes a generic "Group Detection Component." While we've noted (in Section V) that GDA-component-TerroristGroup is an instance of the class GroupDetectionProcess, we haven't said anything yet about how such a specific component instance is selected from among the available alternatives for such a general process class. Beyond enabling semantic interoperability of enterprise workflow components, IARPA's broader objectives in Tangram have included providing technology for characterizing, for a given generic workflow process, the likely performance of a given specific component with data inputs having certain characteristics, so that the workflow management system can select the component likely to perform best in any given circumstance. Our toolkit supports this objective by automating the formal description and registration of newly defined components in Tangram's process catalog [9].

It's worth noting that all of the toolkit's other heretofore-described capabilities remain applicable in the (perhaps more pragmatic) setting where users specify particular components for all workflows themselves.

REFERENCES

[1] Almendros-Jiménez, J. M., Becerra-Terón, A., Enciso-Baños, F. J.: Querying XML documents in logic programming, Theory Pract. Log. Program. 8, 3 (May 2008), 323–361.
[2] Carley, K. M., Dereno, M.: ORA—Organizational Risk Analyzer. Tech. rep. CMU-ISRI-06-113, Carnegie Mellon University, August 2006.
[3] Kubica, J., Moore, A., Schneider, J.: Tractable group detection on large link data sets, Third IEEE International Conference on Data Mining (ICDM-2003), pp. 573–576, 19–22 Nov. 2003.
[4] Macskassy, S. A., Provost, F.: NetKit-SRL: A Toolkit for Network Learning and Inference, in Proceedings of the NAACSOS Conference, June 2005.
[5] Murray, K., Harrison, I., Lowrance, J., Rodriguez, A., Thomere, J., Wolverton, M.: PHERL: an Emerging Representation Language for Patterns, Hypotheses, and Evidence, in Proceedings of the AAAI Workshop on Link Analysis, 2005.
[6] Neville, J., Jensen, D.: Dependency networks for relational data, in Proceedings of the 4th IEEE International Conference on Data Mining, 2004.
[7] Pioch, N., Hunter, D., Fournelle, C., Washburn, B., Moore, K., Jones, E., Bostwick, D., Kao, A., Graham, S., Allen, T., Dunn, M.: CADRE: continuous analysis and discovery from relational evidence, International Conference on Integration of Knowledge Intensive Multi-Agent Systems, pp. 555–561, 30 Sept.–4 Oct. 2003.
[8] Wolverton, M., Berry, P., Harrison, I., Lowrance, J., Morley, D., Rodriguez, A., Ruspini, E., Thomere, J.: LAW: A Workbench for Approximate Pattern Matching in Relational Data, in Proceedings of the Fifteenth Innovative Applications of Artificial Intelligence Conference (IAAI-03), 2003.
[9] Wolverton, M., Martin, D., Harrison, I., Thomere, J.: A Process Catalog for Workflow Generation, in The Semantic Web—7th International Semantic Web Conference, Springer, vol. 5318/2008, pp. 833–846, 2008.