Intelligence Analysis and the Semantic Web: An Alternative Semantic Paradigm

Brock Stitts

Abstract—Intelligence analysis involves gathering data from multiple and diverse sources. The Internet provides a monstrously large set of diverse sources. It is so large and diverse, in fact, that manually gathering data from all the potentially useful sources is not feasible. This is where the Semantic Web comes into play. With the Semantic Web, web pages are given machine-understandable content so that web agents can search the Internet and perform tasks autonomously. A key requirement for this machine-understandable content is that it provide semantic interoperability among the various web pages. The Semantic Web, as its chief advocate, Sir Tim Berners-Lee, admits, remains “largely unrealized.” The thesis presented here is that by going back to the foundations of semantics, we can generate a new hypothesis as to how the Semantic Web can be realized. In particular, centering on activities (or services) instead of trying to build a global upper ontology will cope more effectively with semantic interoperability issues and thus will help realize the Semantic Web.

Index Terms—intelligence analysis, semantic web, ontology, semantics

I. INTRODUCTION

Applied Systems Intelligence, Inc. (ASI) has developed a methodology for intelligence analysis that evaluates a threat via a parameterized Bayesian belief network (BBN). “Feeding” this BBN to build a threat analysis involves actively seeking evidence to confirm or deny parameterized hypotheses. An outstanding data source for this analysis would be the Semantic Web. With it, web pages are given machine-understandable content so that web agents can search the Internet and perform tasks, such as retrieving evidence, autonomously. A key requirement for this machine-understandable content is that it provide semantic interoperability among the various web pages. The Semantic Web, as its chief advocate, Sir Tim Berners-Lee, admits, remains “largely unrealized.”¹ The thesis presented here is that by going back to the foundations of semantics, we can generate a new hypothesis as to how the Semantic Web can be realized. First, we begin with a brief discussion of semantics.

II. TWO VIEWS ON SEMANTICS

• Meaning is denotation: words are defined by reference to the objects or things which they designate in the external world or by the thoughts, ideas, or mental representations that one might associate with them.
• Meaning is use: words are defined by how they are used in effective, ordinary communication.²

If one inquires as to how the denotation gets set up between a word and its object, one finds that a word receives its denotation by virtue of being used in particular contexts. In other words, communication happens within the context of some human activity. It is this activity that gives words their meaning. The philosopher Ludwig Wittgenstein considers the following simple scenario (the so-called "builder's language" introduced in section two of the Philosophical Investigations):

“The language is meant to serve for communication between a builder A and an assistant B. A is building with building-stones: there are blocks, pillars, slabs and beams. B has to pass the stones, in the order in which A needs them. For this purpose they use a language consisting of the words "block", "pillar", "slab", "beam". A calls them out; — B brings the stone which he has learnt to bring at such-and-such a call.”³

This is a simple illustration of the basic functioning of language. The words are used as “moves” in a kind of “game.” Wittgenstein coined the term “language game” based on this and other examples. In general, the meaning of the parts (the words and objects of the activity) is derived from the whole (the activity). Likewise, the activity is defined in terms of its parts. This circle is referred to as the “hermeneutic circle.” Another way of saying this is:

“It (the hermeneutic circle) refers to the idea that one's understanding of the text as a whole is established by reference to the individual parts and one's understanding of each individual part by reference to the whole.”⁴

On this view, words are not the “semantic atoms” out of which sentences are built; the semantic unit is a language game (or activity). Much further argumentation can be provided to support this view, but providing this support is the topic of another paper. Instead, we assume it to be accurate, and generate a new approach to building the Semantic Web based on it.

III. AN ALTERNATIVE SEMANTIC PARADIGM

Underlying the approaches of much symbolic artificial intelligence (AI) is the use of set-theoretic concepts. In such approaches, the world consists of a set of individuals. These individuals have properties. For an individual to have a property corresponds to its being a member of some set. With such a viewpoint, assertions about individuals are not relative to some context. For the approach presented here, individuals and their properties are relative. In particular, they are relative to an activity. The individuals and their properties are components of an activity. While these individuals and properties may be used in other activities, there is no guarantee of synonymy across them. It is the hypothesis here that the assumption of synonymy across language games leads to much erroneous reasoning. In general, the long chains of inference found in some traditional AI systems will be problematic because they will cut across multiple activities and so will contain invalid inferences. Metaphorically, they will be using apples to infer things about oranges.

A. Application to the Semantic Web

As noted above, semantic interoperability between web services (or agents) is a prerequisite of the Semantic Web. The general idea on how to achieve this is to create metadata that accompanies web pages. This metadata would contain the semantic contents of the web page. The representation of the metadata would use the Web Ontology Language (OWL). The assumption by Berners-Lee is that web agents would use an inference engine to reason about this semantic content.⁵ The approach here reverses the implicit denotational semantics of Berners-Lee’s approach; instead, a web agent knows the meaning of the name and parameters of a service if it knows how to use the service. The semantics of a language game are contained in the game itself. With the Semantic Web, however, different language games must interact. The problem of creating the Semantic Web is then essentially a matching problem. A web agent would try to find an appropriate web service to accomplish whatever task it needed to perform. To do this, it must match up its service request with a web service that can fulfill that request. This matching problem is difficult because any solution must also solve the semantic interoperability problem. This problem comes about in two ways. First, the requester and provider may use different symbols that mean the same thing. The second, and more difficult, problem occurs when they use the same symbol but mean different things by that symbol. To make matters worse, both problems can occur within a single match.

This matching problem has no easy solution. What we outline here is a proposed set of techniques for addressing it.

• Use Google-style page ranking as part of the matching algorithm. This is clearly effective to some degree, but one need only attempt using Google to perform Berners-Lee’s example of the Semantic Web in action⁶ to see why Google alone is not sufficient. The goal of this step is really just to generate a set of candidate agents.
• Use case-based reasoning (CBR) methods. If one thinks of a web service as a “solution” and a web agent as having a “problem” it is trying to solve, we see that there is a strong analogy between CBR and the matching problem.⁷
• Perform verification. If a web agent has an “answer key” for selected “problems,” it can use this key to verify that it has used a web service appropriately. Likewise, if the web service provides a sample usage set, this can also be used for verification. The importance of this step cannot be overstated. Verification is a key part of cognition and scientific reasoning. In cognition, the subject generates expectations based on his or her understanding of a situation. If these expectations are met, that understanding is verified.
• Rather than just providing a service’s name, input parameters, and output parameters, provide instructions (in the form of metadata) on how, why, and when the service should be used, and by whom. Although these “instructions” would be prone to ambiguity just as all symbols are, they provide a richer data set to use in matching.

Just as the Web gradually grew as content providers built more content, the approach here would lead to a gradual growth of the Semantic Web. In fact, every piece of this solution would evolve over time. Clearly much work needs to be done to flesh out the details. ASI is currently at work doing this so as to extend its intelligence analysis capabilities.

IV. CONCLUSION

If the thesis presented here is correct, much of the work in deriving an upper ontology will not be all that productive. With the IEEE Suggested Upper Merged Ontology (SUMO), for example, there are bound to be numerous cases where its logical axioms are ambiguous; they apply in some contexts but not others. Rather than solving the problem of how to keep long chains of reasoning consistent, the approach here is not to perform them. The Semantic Web has two components: the Web and semantics. Semantics for natural languages are captured in dictionaries. However, dictionaries are descriptive, not prescriptive. Neologisms are generated when new situations arise that call for them, and are created by a wide variety of language users. Likewise, the Web is built “bottom up” by its numerous content providers. Having a committee define language syntax is workable, but this does not hold for semantics. The semantics of a language is the set of uses of that language. How to use and grow that language is best left to the users of the language.

¹ Nigel Shadbolt, Wendy Hall, Tim Berners-Lee (2006). "The Semantic Web Revisited". IEEE Intelligent Systems. Retrieved on 2007-04-13.
² See http://en.wikipedia.org/wiki/Philosophical_Investigations
³ See http://en.wikipedia.org/wiki/Language-game
⁴ See http://en.wikipedia.org/wiki/Hermeneutic_circle
⁵ See http://www.sciam.com/article.cfm?id=the-semantic-web&print=true
⁶ See http://www.sciam.com/article.cfm?id=the-semantic-web&print=true
⁷ See http://en.wikipedia.org/wiki/Case-based_reasoning
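
The matching techniques listed under "Application to the Semantic Web" can be illustrated with a minimal sketch. It is not the method ASI has implemented, which the paper does not specify. It assumes a toy setting in which a service is a local Python callable, a simple word-overlap score stands in for the page-ranking step, and the agent's "answer key" is a list of (input, expected output) pairs. All names (`Service`, `rank_candidates`, `verify`, `match`) and the temperature-conversion services are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Service:
    """Hypothetical web service: a name, usage metadata, and a callable."""
    name: str
    instructions: str                 # free-text "how/why/when/who" metadata
    call: Callable[[float], float]    # stands in for invoking the remote service


def rank_candidates(request: str, services: List[Service]) -> List[Service]:
    """Candidate generation: order services by word overlap with the request.

    A crude stand-in for the ranking step; it only proposes candidates.
    """
    wanted = set(request.lower().split())

    def overlap(service: Service) -> int:
        words = set((service.name + " " + service.instructions).lower().split())
        return len(wanted & words)

    ranked = sorted(services, key=overlap, reverse=True)
    return [s for s in ranked if overlap(s) > 0]


def verify(service: Service, answer_key: List[Tuple[float, float]],
           tol: float = 1e-6) -> bool:
    """Verification: accept a service only if it reproduces the answer key."""
    return all(abs(service.call(x) - expected) <= tol
               for x, expected in answer_key)


def match(request: str, services: List[Service],
          answer_key: List[Tuple[float, float]]) -> Optional[Service]:
    """Return the first ranked candidate that passes verification, if any."""
    for candidate in rank_candidates(request, services):
        if verify(candidate, answer_key):
            return candidate
    return None


# Same symbol ("temperature"), different meanings: Kelvin vs. Fahrenheit.
to_kelvin = Service("temperature", "convert celsius to kelvin",
                    lambda c: c + 273.15)
to_fahrenheit = Service("temperature", "convert celsius to fahrenheit",
                        lambda c: c * 9 / 5 + 32)
# Different symbols, same meaning as to_fahrenheit.
c2f = Service("c2f", "centigrade degrees into fahrenheit",
              lambda c: c * 9 / 5 + 32)

services = [to_kelvin, to_fahrenheit, c2f]
answer_key = [(0.0, 32.0), (100.0, 212.0)]   # the agent's known "problems"

chosen = match("convert temperature celsius", services, answer_key)
print(chosen.instructions if chosen else "no verified match")
```

In this sketch the two services named "temperature" tie on word overlap, and only verification against the answer key separates them, which is the same-symbol, different-meaning case. The `c2f` service shares no words with the request and is never proposed, a small instance of why ranking alone is not sufficient and why richer usage metadata would help.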