Accessing Information and Services on the DAML-Enabled Web Grit Denker, Jerry R. Hobbs, David Martin, Srini Narayanan, Richard Waldinger SRI International Menlo Park, California, USA denker@csl.sri.com, fhobbs, martin, narayana, waldingeg@ai.sri.com ABSTRACT ence during search, and the declarative representation of the capa- The DARPA Agent Markup Language (DAML) program aims to bilities of services and procedures available on the Web. allow one to mark up web pages to indicate the meaning of their 1.1 Some DAML Queries content; it is intended that the results delivered by a DAML-enabled browser will more closely match the intentions of the user than is To illustrate what is desired on the Semantic Web, we present possible with today’s syntactically oriented search engines. a set of queries of increasing complexity expressed in natural lan- In this paper we present our vision of a DAML-enabled search guage, which should be answered efficiently in our framework. The architecture. We present a set of queries of increasing complexity queries, which are concerned with the general area of searching for that should be answered efficiently in a Semantic Web. We describe and obtaining publications via the Internet, illustrate roughly in- several scenarios illustrating how queries are processed, identifying creasing complexity in the requirements for the search process. the main software components necessary to facilitate the search. We examine the issue of inference in search, and we address how to 1. Find information about a researcher with name “James Hend- characterize procedures and services in DAML, enabling a DAML ler”. (This can very nearly be accomplished with keyword query language to find web sites with specified capabilities. searches, although for very common names the search engine would return too many hits by not filtering on “researcher”.) Key Words: Semantic Web, DAML, inference, Web services, pro- cess modeling. 2. Find a reference to a paper about SHOE that is co-authored by Hendler. (SHOE was named in part because Web searches 1. INTRODUCTION for it would also turn up sites on shoes. DAML will restrict Querying the Web today can be a frustrating activity because the the search to references to publications about SHOE.) results delivered by syntactically oriented search engines often do 3. Find a reference to the most recent paper on SHOE with not match the intentions of the user. This problem is caused by the Web’s lack of semantic structures that could be exploited during James Hendler as a co-author. (For this query, the search engine must find all references possible within search con- the search process. straints, find their dates, and pick the most recent. It must DARPA’s DAML program (http://www.daml.org) aims know about the structure of references and about date arith- at overcoming this problem. The DARPA Agent Markup Language metic.) is intended to allow annotating web pages to indicate the meaning of their content. Thus, search engines are supported in their deci- 4. Request the Stanford Library to hold a copy of Daniel Den- sions about the appropriateness of a web page as an answer to a nett’s book “Elbow Room” for me. (Here the search engine query and are able to extract the most appropriate information. must deal with service and procedural aspects of web pages. DAML allows specifying ontologies and annotating Web content A password is required, and logging in is necessary, as well with respect to these ontologies. Together with a yet-to-be-defined as searching for the book on the library’s web site and exe- query mechanism, these DAML annotations are intended to ease cuting the process for requesting a hold.) and improve searches on the Internet. In this paper we present our vision of a DAML-enabled search architecture by outlining differ- 5. Buy a copy of Daniel Dennett’s “Elbow Room” for me from ent software components and their functionality, the use of infer- amazon.com. (This again requires executing a procedure, but  Supported by the Defense Advanced Research Projects Agency in addition involves security concerns in sending in credit through the Air Force Research Laboratory under Contract F30602- card information.) 00-C-0168. These examples motivate our concerns in the long term. In this paper we will illustrate what is required to handle them. In Section 2 we describe several scenarios illustrating how queries Permission to make digital or hard copies of all or part of this work for are processed. Along the way we identify the main software com- personal or classroom use is granted without fee provided that copies are ponents necessary to facilitate the search, and we summarize the not made or distributed for profit or commercial advantage and that copies software architecture for DAML-enabled search and outline the bear this notice and the full citation on the first page. To copy otherwise, to main functionalities of its components. In Section 3 we examine republish, to post on servers or to redistribute to lists, requires prior specific the issue of inference in search. Although DAML is still in the pro- permission by the authors. Semantic Web Workshop 2001 Hongkong, China cess of being defined and its query language has not yet been spec- Copyright by the authors. ified, we have implemented the inference scenarios in first-order logic on an automatic theorem prover. As DAML develops, we we cite the SRI Researcher ontology that has lastName and are defining the mapping between it and this notation. Finally in firstName as properties. FIND means Section 4 we address the issue of characterizing procedures and that we expect the query to return objects of class Researcher services in DAML in terms of an existing model of complex pro- (including all properties that are defined for this class). We restrict cesses. This will enable a DAML query language to find web sites our result to those researcher objects that have “James Hendler” as with specific capabilities. name. After parsing the query, it would be handed to a search engine 2. A DAML-ENABLED ARCHITECTURE DAML-Q (for “DAML Query engine”) that would be capabable of processing queries written in the DAML query language. During FOR QUERYING THE WEB its search it makes use of the ontology information that is part of the query. 2.1 Query Processing: Example Scenarios We present the flow of data and control for DAML query pro- cessing, with the help of the first example query of the previous section. xmlns:SRIRes = "http://www.ai.sri.com/daml/ DAML ontologies for publications, researchers, and topics have ontologies/Researcher#"> been built. Here we present only those parts necessary for process- James ing our example queries. More details can be found under Hendler Dr. www.ai.sri.com/daml/ontologies/ .... in Publication.daml and Researcher.daml. In Publication.daml a class Publication-ref has two properties among others: author, of type Researcher, and topic, of type Topic. The class Researcher is declared in Figure 2: Scenario 1 Researcher.daml along with its properties lastName and Assume a DAML-annotated web page as illustrated in Fig. 2. firstName of type String. The property name as well as DAML-Q selects this web page for further inspection because the subTopics and relatedTopic are properties of the class Topic. ontology cited in the query matches one of the ontologies in the Fig. 1 summarizes the classes and their relationships via property list of namespaces of the web page. From the xmlns declaration DAML-Q obtains the namespace identifier (here “SRIRes”) and declarations, where List is taken to be a way to form lists. sequentially searches the content of this web page for a tag James firstName and author name Hendler and select objects that satisfy these criteria. Note that DAML-Q topic needs to be capable of recognizing the various equivalent notations Publication-ref Topic that can be used instead, such as relatedTopic List or subTopics Hendler Figure 1: Ontologies We use an ad hoc notation for DAML queries. Our purpose is among others. The query result will contain the researcher object not to suggest a particular query language, but rather to identify (with all its properties) defined in the web page illustrated in Fig. concepts necessary to support DAML-enabled search. 2. The first query, “Find information about a researcher with name A slightly more complex situation is one in which a web page James Hendler”, may be formalized as follows: does not refer to the ontology stated in the query but is related to that ontology by some means. For example, assume a web xmlns: SRI = ‘‘http://www.ai.sri.com/daml/ page that declares the ontology www.ai.sri.com/˜hobbs/ ontologies/Researcher#’’ MyResearcher.daml. A search through this web page shows FIND that this ontology is defined in terms of the SRI Researcher ontol- SUCH-THAT ogy. More precisely, a subclass MyResearcher of Researcher Hendler is declared in www.ai.sri.com/˜hobbs/MyResearcher.daml James that adds properties to the Researcher class of the original SRI END ontology. Using this information, DAML-Q can infer that a web page like the one in Fig. 3 is a match for the query. DAML-Q ex- The query specifies the ontology (or ontologies) it uses. In our tracts from the ontology www.ai.sri.com/hobbs/MyRe- example we search for information about researchers. Therefore, searcher.daml and the given declaration, It is obvious that the task of mapping ontologies, classes and properties is nontrivial. Already in our simple example one has to xmlns: SRI = "http://www.ai.sri.com/daml/ be able to map two properties into one. We envision that DAML ontologies/Researcher#"> dard interface for ontology mapping. This ontology has to provide .... for complex mapping operations such as mapping different kinds of —————————- combinations of properties or classes. Given such a “standardized xmlns: hobbs="http://www.ai.sri.com/ hobbs/ ontology mapping interface”, a software module like DAML-M MyResearcher#"> can collect mapping information from different sites that advertise James such mappings using the “DAML mapping” ontology. Here trust Hendler issues are critical: A search engine like DAML-Q might prefer to .... get mapping-related information only from specific sites since they have proven reliable with respect to the mapping assertions. Trust and security issues are orthogonal to other parts of the query. The Figure 3: Scenario 2 DAML query language also needs to be able to support the specifi- cation of such requirements. Further complexity may be introduced by less-defined DAML queries. For instance, the user may be aware that “James Hendler” that the class tag to be searched for is MyResearcher (the prop- is often also referred to as “Jim Hendler”. Thus, the formaliza- erty tag remains lastName). tion of the query may leave the specifics of the first name open, or suggest alternatives that are considered equivalent. For instance, Instead of placing the burden of recursively searching through the query “Find information about a researcher with name James web pages that have direct or indirect references to ontologies con- Hendler” may be formalized as tained in the query on DAML-Q, it seems much more efficient to xmlns: SRI = "http://www.ai.sri.com/daml/ implement a mapping service for ontologies, which we may call ontologies/Researcher#}" DAML-M. DAML-M would be a service that for a given ontol- FIND ogy and a set of classes and properties returns mappings to other SUCH-THAT Hendler ontologies and their classes and properties that are declared to be equivalent. For instance, the Knowledge Systems Lab of Stanford OR( University has declared a class PERSON in its ontology www.ksl. James , stanford.edu/projects/DAML/ksl-daml-desc.daml. The property HAS-FULL-NAME of type STRING is declared for this Jim class. Contacted by DAML-Q with the query at hand, DAML-M ) would return this information indicating that web pages that refer END to the KSL ontologies and that have content marked as an object of class PERSON with property HAS-FULL-NAME are also possible It is also possible that one does not know the exact first name, answers to the query. Fig. 4 illustrates this situation. but believes that it is “James” or something similar. Assume that there exists an Internet service that for a given first name delivers names that are commonly used instead of the given first name or that are considered equivalent or a nickname or an abbreviation. For instance, there might be a “Name-Match” service available that given a name like “Jim” returns “James” and “J.”. A query that keeps the first name open and rather relies on such a service could look as follows: xmlns: SRI = ‘‘http://www.ai.sri.com/daml/ ontologies/Researcher#’’ xmlns: S = ‘‘http://www.ai.sri.com/daml/ —————————- ontologies/Service#’’ http://www.ai.sri.com/daml/ontologies/Researcher# FIND --> http://www.ksl.stanford.edu/projects/ SUCH-THAT DAML/ksl-daml-desc#"> Hendler Researcher --> PERSON firstName lastName --> HAS-FULL-NAME —————————- USE xmlns: KSL="http://www.ksl.stanford.edu/ projects/DAML/ Name-Match ksl-daml-desc#"> James Hendler James ... END Figure 4: Scenario 3 The intended meaning of this ad hoc notation is that DAML-Q is supposed to make use of a service in order to determine the possible first names. The name of the service to be used is Name-Match ler is a person. While james-hendler refers to a unique indi- and it is to be consulted with the parameter James. This ser- vidual (ultimately a URI), the name (personq "James Hend- vice would deliver a set of possible names that are associated with ler") is shared by several people. These distinctions seem pedan- “James” that will be used in processing the query. Service specifi- tic, but when we try to ignore them we get into trouble. cations of various kinds are one of the key foci of our research and The relation person-val relates a name of a person with the are described in greater detail below. person itself. Thus, if As a last possible scenario we consider the situation where a service on the Web has stored specialty information, the so-called (person-val ?personq ?person) DAML-R (for DAML speciality repository). DAML-R would pro- holds, ?personq is a name for ?person. While, by convention, vide comprehensive information for specific topics. For instance, ?person, ?person0, ?person1, etc., are variables that range there may exists a service that has cached links to home pages of over people, ?personq, ?personq0, etc., range over names of researchers in a specific field, or that has cached the most relevant people. information about those researchers locally (e.g., their names, af- A paper is of sort paper; a reference to a paper is of sort paperq. filiations, publication references). Such a service can obtain its in- Thus we think of a reference as a kind of name for a paper. formation offline, and manipulate and store it in a way that allows Other concepts are best explained in the context of the example. high throughput of queries. We consider Query 3 from Section 1.1: 3. INFERENCE IN QUERIES Find a reference to the most recent paper on SHOE We have shown how the differing ways information can be rep- with James Hendler as a co-author. resented on the Web can call for successively more complex pro- This query may be phrased in the SNARK language as: cessing, even for simple queries. But for complex queries, this is even more true. In our initial list of sample queries, all but the first (find ?paperq two require significant inference capabilities. such-that As part of our research, we have been considering what kinds (and of inference are necessary to answer plausible queries and to per- (pub-val ?paperq ?paper) form typical tasks using Web-based documents and services. We (author ?paper ?person) have been considering what constructs will be necessary in the lan- (person-val (personq "James Hendler") ?person) guage to allow designers of a web site to advertise its services. Also (about ?paper (topic "SHOE")) (= (pub-to-year ?paper) (year-fn ?natural))) we have been investigating the representation of the background prefer knowledge necessary to connect queries with the web sites that sup- starts-after-starting-of ply the answers. on We have been using the theorem prover SNARK [11] as a vehicle (year-fn ?natural) for our experiments with inference. This is not to say that SNARK time-limit needs to be incorporated into a DAML search engine, but rather 10) that by studying the SNARK inferences, we can see what kinds of In other words, we want to find a reference ?paperq such that processing suffice to handle DAML queries. SNARK is an automatic first-order theorem prover, implemented  ?paperq is a reference to ?paper—the relation pub-val, in Common Lisp, that can be tuned and specialized to exhibit high analogous to person-val, relates a publication reference performance in particular subject domains. It has a highly evolved with the corresponding publication. sort structure that enables us to represent taxonomic information concisely and to infer consequences quickly. It has built-in facil-  ?person is an author of ?paper—there may be other au- ities for fast temporal reasoning. It has answer-extraction facili- thors. ties, which allow it to answer questions instead of merely proving  (personq "James Hendler") is a name for ?person. theorems. These facilities were developed for deductive program synthesis [2], [6], but apply as well to query answering.  ?paper is about the topic SHOE; if the topic ontology con- SNARK has a procedural-attachment mechanism, which makes tained subtopics of SHOE, papers on those subtopics would it possible to link symbols in the logic with procedures; when the be acceptable. symbol is employed in the proof, the corresponding procedure is executed. This enables us to invoke outside sources, including web  ?paper was published in year (year-fn ?natural); sites, while the proof is in progress. The examples in this section e.g., while 2000 is a natural number, (year-fn 2000) are expressed in SNARK notation; we intend that this language be is the year 2000. inter-translatable with the logic language of DAML. Furthermore, if there are several papers found that satisfy the 3.1 An Experiment with Document Search above criteria, we prefer the latest one. The relation As an experiment, we consider what kinds of inference are re- starts-after-starting-of quired to answer queries for searching for documents. We phrase this query in terms of the ontologies and theories we have devel- is a relation on time intervals such that oped for bibliographical references, names, addresses, topics, and dates. We then use SNARK to find answers, based on theories we (starts-after-starting-of ?time-interval1 have developed for these areas. ?time-interval2) In reading the example, one must distinguish between strings, names, and entities. For example, "James Hendler" is a string, holds if ?time-interval1 starts after ?time-interval2; (personq "James Hendler") is a name, and james-hend- thus (starts-after-starting-of Let us also assume that we have the reference to a later SHOE (year-fn 2000) paper ([3]): (year-fn 1997)) (assert We execute this query by first finding a single paper reference ’(pub-val that meets all our criteria. We then look for another one, with a later (inproceedingsq publication date. We do not ask for the latest of all of Hendler’s author (coq publications, since we can never be sure if we have seen all of them; (personq "Jeff Heflin") (personq "James Hendler")) instead, we give the search a time limit, and return the latest paper title (titleq "Searching the Web with SHOE") we have found in that time. booktitle We assume we have access to DAML-annotated publication lists; (titleq the following is a description of a 1997 paper on SHOE ([5]), of "Artificial Intelligence for Web Search. which Hendler is a co-author. Papers from the AAAI Workshop.") year (year-fn 2000) "Sean Luke" publisher (publisherq "AAAI Press") "Lee Spector" number (pubnumber "WS-00-01") "David Rager" address (cityq "Menlo Park" "CA" "US") "James Hendler" topic (topic "SHOE")) shoe-aaai-paper) "Ontology-based Web Agents " :name ’shoe-aaai-paper-reference) Let us follow a few steps of the SNARK inference process by "Proceedings of the First International which an answer was found. Conference on Autonomous Agents (Agents97)" We begin with a logical sentence obtained from the query "1997" (Row 193 (or (not (pub-val ?paperq ?paper)) "Association for Computing Machinery" (not (author ?paper ?person)) (not (person-val "New York, NY, US" (personq "James Hendler") "SHOE" ?person)) (not (about ?paper (topic "SHOE"))) (not (= (pub-to-year ?paper) This may be represented in SNARK notation as: (year-fn ?integer&nonnegative)))) (assert Answer (ans ?paperq ’(pub-val (year-fn ?integer&nonnegative))) (inproceedingsq author (coq Note that, because SNARK is a refutation procedure, queries are (personq "Sean Luke") negated, and inference proceeds until a contradiction is obtained. (personq "Lee Spector") Also note that the query, and its logical descendents, is accompa- (personq "David Rager") (personq "James Hendler")) nied by an Answer expression indicating what answer we expect title (titleq "Ontology-based Web Agents") to obtain from the proof. Because our preference is based on the booktitle year, we include the year of publication as part of our answer. The (titleq "Proceedings of the First expression integer&nonnegative is SNARK’s internal nota- International Conference tion for the sort of natural numbers. on Autonomous Agents Formulas and their associated answers are called “rows”. (Agents97)") year (year-fn 1997) We omit some details of the proof. publisher Using this assertion, and others from our publication ontology, (publisherq we obtain the first answer "Association for Computing Machinery") (Row 335 address false (cityq "New York" "NY" "US") (resolve 334 name-of-lee-spector) topic (topic "SHOE")) Answer (ans 97:_ont_ba_web_ag) (assert :name ’shoe-acm-paper-reference) ’(pub-val Here (inproceedingsq author (coq (coq ?personq1 ?personq2 ...) (personq "Sean Luke") (personq "Lee Spector") is a publication list containing names ?personq1, ?personq2, (personq "David Rager") : : : . Names, indicated by the keyword :name, have no logical or (personq "James Hendler")) functional significance; they are used only to make the proof traces title (titleq "Ontology-based Web Agents") easier for us to follow. Our representation for references in logic is booktitle derived from one developed in Maude [1] by Meseguer. (titleq "Proceedings of the First The reference indicates that the paper appears in a conference International Conference on Autonomous Agents proceedings, gives the title of the paper, the title of the proceedings, (Agents97)") the date of publication, the publisher, the address of the publisher, year (year-fn 1997) and the topic. The notation for references in this theory is based on publisher that of Bibtex. (publisherq "Association for Computing Machinery") It is also possible to obtain lists of papers, ordered by our prefer- address ence, so that the more recent are returned first. (cityq "New York" "NY" "US") We can also use inference to combine capabilities of multiple topic (topic "SHOE")) web sites. For example, suppose we had forgotten Hendler’s name, (year-fn 1997))) but remembered he worked for the University of Maryland. Then Note that, since SNARK is a refutation procedure, obtaining a row we could phrase the query false indicates that a proof is complete. (find Since we want the latest possible publication, we begin a new ?paperq query, to find a publication later than 1997: such-that (Row 324 (and (or (not (pub-val ?paperq ?paper)) (pub-val ?paperq ?paper) (not (author ?paper ?person)) (author ?paper ?person) (not (person-val (employed-by (personq "James Hendler") ?person ?person)) (org "University of Maryland")) (not (about ?paper (topic "SHOE"))) (about ?paper (topic "SHOE")) (not (= (pub-to-year ?paper) (= (pub-to-year ?paper) (year-fn ?integer&nonnegative))) (year-fn ?natural))) (not (starts-after-starting-of prefer (year-fn ?integer&nonnegative) starts-after-starting-of (year-fn 1997)))) on Answer (ans ?paperq (year-fn ?natural) (year-fn ?integer&nonnegative))) time-limit 10) This query is the same as the one we started with, except it con- tains an additional condition: The inference process can then consult the University of Maryland web page to be sure that at least one of the co-authors of the papers (starts-after-starting-of (year-fn ?integer&nonnegative) it finds in the bibliography is employed there. (year-fn 1997)) SNARK is limited to first-order logic and has no special facilities for combining theories. We have been working with Jose Meseguer In other words, we insist that the publication we find is more recent to connect Maude [1], a higher-order language with metalogical ca- than 1997. pabilities, with SNARK. This would make our presentation of the- The refutation proceeds similarly to what we have seen already, ories simpler and more elegant. For instance, now we have separate except this time SNARK finds the second reference, to the AAAI sorts, personq and person, for names and people respectively. 2000 paper. Similarly we have separate sorts, cityq and city, for names of SNARK uses a temporal reasoning procedure, based on the Allen cities and the cities themselves. With Maude we will be able to have temporal calculus, to check temporal constraints, whether they arise a function name that maps each sort into the corresponding name directly from the query or are a consequence of our preferences. sort; thus we would not need new sorts personq and cityq. This allows us to determine that the Year 2000 paper is indeed more Also Maude would give us more flexible syntax and the abil- recent than the 1997 paper. (It could also discriminate on the basis ity to develop new theories in a modular way, by instantiating and of publication month and day.) Thus, the AAAI paper satisfies all combining old ones. the constraints and is our next best selection: (Row 421 4. DEFINING AND ACCESSING false (resolve 420 name-of-jeff-heflin) WEB SERVICES Answer In the preceding sections, we have discussed the mechanisms by (ans which DAML will facilitate the representation and discovery of ob- (inproceedingsq jects containing information on the Web. We turn now to a consid- author eration of services on the Web, with two basic observations: first, it (coq (personq "Jeff Heflin") should be possible to use these same mechanisms for representing (personq "James Hendler")) and discovering services; second, there is a close, interleaved rela- title tionship between querying, or searching, the Web, and making use (titleq "Searching the Web with SHOE") of services on the Web. Queries 4 and 5 of Section 1 are requests booktitle for services, and not merely for information. (titleq Today, for the most part, services on the Web are created for "Artificial Intelligence for Web Search. Papers from the AAAI Workshop.") human consumption, with interfaces intended for humans, not for year (year-fn 2000) software agents. In contrast, a DAML-enabled Web, while still publisher (publisherq "AAAI Press") allowing for human consumption of services, will make software number agents first-class citizens of the Web, with full access to its services. (pubnumber "WS-00-01") This, in turn, will allow the number and variety of services offered address (cityq "Menlo Park" "CA" "US") on the Web, and the efficiency with which they are used, to continue topic (topic "SHOE")) (year-fn 2000))) to increase at a dramatic rate. Web services can be of many types, including but not limited SNARK will continue to look for Hendler SHOE papers more to the types that are already familiar to human users: querying recent than 2000; when the time limit is exceeded, it returns the of databases, catalogs, digital libraries, and other types of infor- reference to the best paper we have found. mation repositories, searches and classification services provided by portals, business-to-consumer (B2C) transactions, business-to- (city "New York" "New York" "US") business (B2B) transactions, etc. It is worth noting that Web ser- vices are not limited to the two-party, client/server approach most is the abstract entity New York City. Similarly, the function air- commonly used. Services can involve any number of parties, with port maps a name-string or a code for an airport into the corre- complex patterns of interaction between them. For example, a sponding airport. business-to-business transaction may well involve a buyer, a seller, This service provided by Travelocity can be advertised in terms a financer, and a shipper. of other concepts from the same or other ontologies. For instance, To make use of a Web service, a software agent needs a formal the above assertion does not say anything about the distances that description of the service, and the means by which it is accessed. the web site computes, between a city and its local airport. That An important goal for DAML, then, is to establish a framework information is provided by an additional assertion. within which these descriptions are made and shared. (assert ’(implies 4.1 Logical Advertisement of Services (city-airport-tvl The author of a DAML-annotated web page will need to provide ?city-string ?state-abbr ?country-abbr sufficient information to link the services provided by the web page ?airport-code ?airport-string ?real ?string) with the concepts in an appropriate theory or ontology. Usually (= (distance-between there will not be an exact match between the service and a sym- (city ?city-string ?state-abbr ?country-abbr) bol in the theory; rather, the author will need to describe a logical (airport ?airport-code)) relationship between each service and the concepts of the theory. (miles ?real))) For one thing, the web site functions are defined on concrete data types, such as strings or real numbers; the functions and relations :name ’travelocity-city-airport-distance in the theory will be defined on abstract entities, such as cities and :documentation "Travelocity can determine the distance between people, independently of how they are represented. A query will a city and its local airports") not necessarily be phrased in terms of the same representation se- lected by a web site. Most people will not even know the concrete Here, distance-between gives the abstract distance between functions and relations computed by particular web sites. two places, independent of unit of measurement; miles maps a For example, the Travelocity site www.travelocity.com concrete real number into an abstract distance, the corresponding computes many functions. One of them returns the airports local distance in miles. There are other functions, for kilometers, feet, to a city, with their respective distances from the city. We can- etc. This representation does not identify numbers with distances; not assume that the user knows that Travelocity provides this ser- rather, it allows web sites that only deal with miles to communicate vice. There may, however, be an ontology that defines an relation with web sites that only deal with kilometers. city-airport, that holds between a city and its local airports, These examples show the importance of having some logical which are abstract entities. power to advertise the services of web sites. We can see the value of The service that Travelocity provides for finding local airports a sorted logic with equality. Although one could phrase the same can then be advertised using the following logical axiom. assertions in unsorted logic without equality, they would be un- (assert wieldy to write and more inefficient to reason with. In particular, ’(implies without sorts we would need to prefix each assertions with condi- (city-airport-tvl tions, such as ?city-string ?state-abbr ?country-abbr ?airport-code ?airport-string ?real ?string) (implies (city-airport (and (city ?city-string ?state-abbr ?country-abbr) (real ?real1) (airport ?airport-code))) (city-string ?city-string) ...) :name ’travelocity-city-airport ...) :documentation "Travelocity can determine which airports are Without equality, it is peculiarly awkward to reason about func- local to a given city. The Travelocity tions, for example to express the uniqueness of their values. relation city-airport-tvl implements In addition to such linking assertions, we will need other asser- the ontology relation city-airport") tions for background inference, to link queries with web sites and While city-airport is the abstract relation between a city to link different web sites together. For example, the following and its local airports, city-airport-tvl is the concrete rela- assertion tells us that the time required to travel from one place to tion computed by the Travelocity web site that finds airports local another is limited by the distance between them and by the person’s to a given city, and their distances in miles from the city. While best speed. city-airport applies to cities and airports, which are abstract (assert entities, city-airport-tvl applies to concrete strings and real ’(implies numbers, which are a particular representation of the abstract enti- (and ties. The relationship between the concrete strings and numbers and (at ?person ?place1 ?time-point1) the abstract entities is expressed by functions and relations from the (at ?person ?place2 ?time-point2)) ontology itself. (=< (distance-between ?place1 ?place2) Travelocity choses to represent a city with three name-strings, (*-quantity for the city, the state (or region), and the country, respectively. For (abs-quantity instance, the function city maps these name-strings into the ap- (time-minus ?time-point1 ?time-point2)) propriate city, an abstract entity. Thus (speed-between ?place1 ?place2) provides ))) Resource Service :name ’distance-equals-rate-times-time :documentation presents supports "If a person is going from ?place1 at ?time-point1 to ?place2 at ?time-point2, the imple ments distance between the points is less than or ServiceProfile ServiceGrounding equal to the product of the difference between the times and the best speed between the two places.") :KDWWKH ServiceModel +RZWR VHUYLFHGRHV DFFHVVLW While this assertion does not directly refer to any particular web +RZLWZRUNV site, it is useful in handling queries that do invoke web sites. For in- stance, if we are using Travelocity to book a trip, the decision about Figure 5: Top level of service ontology which flight to book might be determined by the time required to travel between the airport and the ultimate destination city. It may not be necessary to express background assertions in the DAML language, but it will be necessary to invoke such assertions  What does the service require of the users and provide for in the course of handling queries. It will be important that people them? writing DAML-annotated web pages should find it easy to write declarative statements that describe how the services offered by  How does it work? that web page are related to some theory. While they need not write  How is it used? them in logic directly, they will need to write them with some tool that translates them into logic. If the logical language is less expres- As shown in Fig. 5, these three types of knowledge are captured sive, that translation will be more difficult. It is not true that using a by the built-in properties presents, implements, and supports, of less expressive language will give us more efficiency of inference. class Service. These properties range over built-in classes Service- If we paraphrase the above assertions without using equality and Profile, ServiceModel, and ServiceGrounding, respectively. These sorts, the result will be less efficient inference, not more. are “placeholder” classes; that is, they have no important proper- ties defined. The substantive properties are left to be defined by 4.2 Toward an Ontology for Describing Ser- their various subclasses. For each descendant class S of Service, vices there will exist corresponding subclasses of ServiceProfile, Service- Web sites should be able to employ a set of basic classes and Model, and ServiceGrounding, containing the properties that can properties for declaring and describing services, and the ontology most appropriately represent the required information for class S . structuring mechanisms of DAML provide the appropriate frame- It is important to note that logical rules may be employed in any work within which to do this. In this subsection, we sketch out an or all of (the subclasses of) ServiceProfile, ServiceModel, and Ser- initial proposal for these basic classes and properties. viceGrounding, so as to support the uses of inferential reasoning A note on terminology: In what follows, when we refer to the discussed earlier in this paper. “user” of a service, we are thinking of a software agent, unless stated otherwise. (Of course, a software agent, in most cases, acts  Service profiles. The specification of what a service requires and provides calls for an external, or “black-box”, view of on behalf of some human user.) We propose here some general principles by which a DAML on- the service, giving such things as preconditions that must hold before the service can be requested, required inputs, ef- tology of services may be organized, and a few fundamental classes fects of the service, expected outputs, cost of accessing the and properties that will provide the backbone of any DAML service service, acceptable forms of payment, and so on. The ser- ontology. (We will refer to these fundamental classes and prop- vice profile is meant to provide the essential information by erties as “built-in” classes and properties, because we expect they which a potential user of the service (or perhaps a match- will form a coherent layer of the DAML language, which is generic maker agent) can determine if the service meets the user’s enough to be used as an upper ontology for all service descriptions.) needs. The role of a service profile is analogous to that of a Note, however, that we are not proposing any particular taxonomy procedure declaration (header) in a typed programming lan- of service types; indeed, we are neutral as to whether there should guage, or the schema of a database relation. (Indeed, for be one such taxonomy, a few, or a great many. Our concern here is many simple, one-step, services, the service profile may be to provide the mechanisms and concepts by which any number of essentially the same as a procedure header or schema.) such hierarchies can be constructed and used. Our proposal begins, naturally enough, with built-in class Ser-  Service models. The specification of how a service works vice. The categories of a taxonomy of services will be subclasses calls for an internal, or “glass-box” view of the service. The of class Service, and will be structured according to the needs to a type of representation used to display this information will given domain or application. For example, a bookseller’s web site vary with the nature of the service. For example, a purchase might make use of a toplevel B2C-Service class, the immediate sub- from a Web site may involve a number of steps (specifying classes of which might be B2C-InformationService (including such quantity, desired color, and other purchase attributes; giving things as search and recommendation of books), B2C-Transaction, shipping address; securely inputting credit card information; B2C-AccountMaintenance, and B2C-PurchaseTracking. verification of credit card information; final confirmation). The service model describes the shared knowledge of these 4.3 The Fundamental Properties of a Service steps, which is required for the user(s) and service provider to Our structuring of the ontology of services is motivated by the coordinate their activities. For many such services, the repre- need to provide three essential types of knowledge about a service, sentations associated with process modeling will be essential which may be intuitively characterized by these questions: in describing this knowledge. Thus, we expect there to be one or more important subclasses of ServiceModel based on the world and the dynamics of interaction. In this way, services are process modeling. One such approach, under development at more like verbs and actions than like nouns. Thus representing ser- SRI, is described in the following section. vices requires tapping into an ontology/theory of actions, processes and events.  Service groundings. A service supports one or more service groundings, each of which spells out the implementation- 5.1 A Core Theory of Processes and Events specific details by which the user communicates with the Consider the examples of Queries 4 and 5 (Section 1). These service, such as message formatting, transport mechanisms, queries are information requests that result in actions changing the etc. A service grounding will typically be a well-known set state of the world in a specific way (one results in a buying action of specifications for exchange and interaction, and will play from amazon.com, the other in placing a library hold on a book). a role such as is played today by standards such as HTTP In order for these queries to be processed, the Web agent needs to forms protocol, CORBA IDL, SOAP, OAA’s ICL, KQML, know at least the following: a) input/output format/protocol(OAA, Java’s RMI, and the like. Jini), b) parameters of interaction (payment method, shipping method, In addition, the grounding must specify, for each abstract etc., c) dynamics of interaction (steps involved and the internal tem- type specified in the ServiceModel, an unambigous way of poral structure of a buying action), d) preconditions/effects of a exchanging data elements of that type with the service (that specific action (selecting a book results in its being in the shopping is, marshalling / serialization techniques employed). The cart), e) resources consumed (money), produced or locked (credit- likelihood is that, as a result of market forces, a relatively card number), f) ways of controlling the transaction, such as short small set of groundings will come to be widely used in con- cuts (using stored profile information), interrupting or canceling junction with DAML services. Groundings will be declared from various stages/states in the interaction (you can cancel an or- and documented in detail at various well-known URIs. der before shipping). Continuing with the example, note that buying is not an isolated In a nutshell, the relationship between these three fundamen- concept but is integrated with a cluster of concepts in what might tal classes is this: The service model, which provides the fullest be called the commercial transaction frame. The commercial trans- description of the service, gives an abstract, semantic description action frame involves such concepts as possession, change of pos- of what the service does. (Thus, a given service model could be session (giving, taking/receiving), exchange (the parties in the ex- reused for a variety of services accessed in widely varying ways.) change accept and are expected by their community to accept the Each of the service’s groundings spells out how the data types and results of the exchange), and money. The basic parameters/roles messages, required by the service model, are to be formatted and of this frame, then, will include Money, the Goods (standing for communicated in using a specific instance of the service. Taken to- goods or services), the Buyer (the person who surrenders money gether, the ServiceModel and ServiceGrounding object associated in exchange for the goods), and the Seller (the person who surren- with a Service will give enough information for an agent to make ders the goods in exchange for the money). Further elaborations, use of a newly discovered service. needed for describing some of the peripheral terms in this frame, The service profile summarizes the service model, by giving an involve certain details of the exchange: in some cases, for exam- external view of it (inputs, outputs, preconditions, postconditions), ple, we need to identify the Price, a ratio between the quantity of and may also provide additional information needed to determine money given and the quantity of goods received (e.g. two dollars the appropriateness of the service to a potential user. Generally an ounce), temporal features of the exchange (perhaps the payment speaking, a potential user of a service (or a matchmaker for the is spread over a period of time), the difference between the tender user) examines the service’s service profile to determine if the user and the price (Change), and so on. Still further elaborations can can provide the service’s inputs and make use of its outputs, and separate the owner of the goods or the owner of the money from examines the service’s groundings to determine if it is competent the actual participants in the exchange arrangement. to interact with the service provider (that is, knows how to use one We have been developing just such a parameterized model of the or more of those service groundings). structure of events and processes, and it can support tools for agents In the following section, we discuss how a process-based service to enter, modify, advertise and communicate the capabilities, types model can facilitate the use of interaction sequences for a wide va- and details of events they can observe, monitor, model and control. riety of transaction-based services, such as those typical of B2C The abstract theory of event structure has three major components1. web sites. 1. Temporal structure and evolution trajectories: The fine- 5. MODELING SERVICE PROCESSES structure of events comprises of a set of key states (such as Much of the Web-based ontological content (developed using enabled, ready, ongoing, done, suspended, canceled, stopped, DAML, OIL and other languages) is based on static attribute-value aborted) and a partially ordered directed graph that repre- feature structures arranged in taxonomies. This representation is sents possible evolution trajectories as transitions between ideal for describing and representing complex objects (nouns) such key states (such as prepare, start, interrupt, finish, cancel, as documents, organizations, or bibliographies in a manner that abort, iterate, resume, restart). Each of these transitions may supports structured queries such as the first three of our example be atomic, timed, stochastic or hierarchical (with a recur- queries. sively embedded event-structure). A significant missing piece in the currently available ontolog- 2. Process primitives and event construals: Events may be ical content is the representation of the dynamic content of ser- vices (procedures, protocols, behaviors and transactions). Much of punctual, durative, (a)telic, (a)periodic, (un)controllable, re- versable, (ir)reversable, ballistic or continuous. Thay may the Web is about providing and using services such as reserving a ticket, interacting with a secure site, buying/selling/trading, and 1 For further details, comparisons with hybrid system (dis- making an appointment with a doctor. Services are distinguished crete/continuous) models [4] and an extended treatment of the rep- from objects in that they require representing changes in the state of resentation and semantics of verbs and events, see [9, 10]. describe their resource interactions through relations such as locking, producing, creating, transforming, consuming or destroying. They may support defeasable construal opera- tions of shifting granularity (elaboration (zoom-in), collapse (zoom-out)) and enable focus, profiling and framing of spe- cific parts and participants. 3. Inter-event relations: A rich theory of inter-event relations allows sequential and concurrent enabling, disabling, or mod- ifying relations. Examples include interrupting, starting, re- suming, canceling, aborting or terminating relations. In each A theory of events has to link to a theory of time. We have been of these cases, we have a precise semantics in terms of the developing a fairly rich theory of time [1, 11] that includes a theory overall structure of the interacting events. of time intervals and a theory of time points with varying densities (qualitative, integer, reals, etc.). Our DAML encoding of the start 5.2 A DAML Encoding of Processes time of a process may thus use the point theory while the duration We have a preliminary DAML implementation of many aspects of a process may be specified using the interval theory. of our core theory of processes and events. We are developing a core DAML ontology of services in a larger collaborative effort with Stanford KSL[7, 8], CMU, BBN, and Nokia as part of the Start time for the process DAML service specification language effort. Some aspects of this core ontology are described below. The basic class that implements process-specific information is appropriately termed the EVENT class. Event specific information includes parameters such as the name of the event, where it is to Duration of the process execute, documents updated, read or written as part of execution. These properties obviously refer to noun ontologies like name on- tologies or document ontologies (developed by us or other ontology developers). Processes are complex events that have additional properties with respect to the ordering and conditional execution of individual events. The attempt here is to come up with a minimal set of process classes A simple event that can be specialized to specify a variety of web services. Processes have a name, inputs, outputs, participants, parame- A process is a composite event. ters, preconditions, and effects. Each input, output, parameter, pre- condition or effect is a property of event left unrestricted at this level (it ranges over ”Thing”). Collections of input, output, etc are are unrestricted bags (items are anything). Shown below is an il- lustrative sample from our existing ontology. Complete versions There are two fundamental relations between events and pro- can be found at http://www.ai.sri.com/ontologies/ cesses. One pertains to expanding an event to its underlying process services/Process.daml. (zoom-in) and the other corresponds to collapsing a process into an atomic event (zoom-out). Often we want to define operations on multisets of processes (buying involves simultaneous transfers of money (from credit card bank to seller) and goods (from seller to buyer). To do this we de- fine a PROCESS BAG class that is a subclass of the RDF (http: //www.w3.org/TR/REC-rdf-syntax) BAG class. A multiset of Events Parameters are further typed into input, output and participants. in DAML, enabling a DAML query language to find web sites with specified capabilities. These are, we believe, some of the most crit- ical components of the language required for enabling the Semantic Web. 7. REFERENCES Now we are able to define more complicated assemblages of pro- [1] M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, cesses. For instance, SEQUENCE is a process that is comprised of a J. Meseguer, and J. Quesada. Maude: Specification and list of subprocesses, while CONCURRENT is a process which com- Programming in Rewriting Logic. SRI International, prises a multiset of subprocesses. Computer Science Laboratory, Menlo Park, CA, January 1999. http://maude.csl.sri.com/manual/. [2] C. C. Green. Application of theorem proving to problem solving. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 219–239, May 1969. [3] J. Heflin and J. Hendler. Searching the web with shoe. In Artificial Intelligence for Web Search. Papers from the AAAI Workshop., number WS-00-01, Menlo Park, CA, US, 2000. AAAI Press. [4] T. A. Henzinger. The theory of hybrid automata. In LICS 96 We have also implemented an algorithm that can recursively con- (expanded version), pages 278–292, 1998. struct executable process models of the type described in (Narayanan [5] S. Luke, L. Spector, D. Rager, and J. Hendler. 1999) given DAML descriptions of the dynamic content of ser- Ontology-based web agents. In Proceedings of the First vices. Such models can be constructed automatically by agents International Conference on Autonomous Agents (Agents97), that encounter specific services. Agents can then use these mod- New York, NY, US, 1997. Association for Computing els to track execution of service requests and responses, monitor Machinery. requests and task performance and to plan and schedule specific [6] Z. Manna and R. Waldinger. A deductive approach to service related tasks. While the details of the construction algo- program synthesis. ACM Transactions on Programming, rithm and the process modeling environment are outside the scope Languages, and Systems, 2:90–121, 1980. of this paper, the reader can obtain further details from http: [7] S. McIlraith, T. Son, and Z. H. Mobilizing the semantic web //www.ai.sri.com/daml/services. with daml-enabled web services. In Proceedings of the A rich and structured core theory of service and an automatic Second International Workshop on the Semantic Web construction algorithm that compiles into an executable model should (SemWeb’2001), May 2001. provide us with motivated constraints to design interfaces that allow [8] S. McIlraith, T. Son, and Z. H. Semantic web services. In service providers to describe their services at a high level using do- IEEE Intelligent Systems, March/April 2001. main specific vocabulary. We envison this interaction environment [9] S. Narayanan. Knowledge-based Action Representations for to be a guided core-theory based graphical and natural language Metaphor and Aspect (KARMA). PhD thesis, Computer dialog. Building such a semantically grounded service authoring Science Division, EECS Department, University of environment is a current focus of our work. California at Berkeley, 1997. In summary, an ongoing project addresses a correlated set of es- [10] S. Narayanan. Reasoning about actions in narrative sential missing components in the current state of the Web: theo- understanding. In Proceedings of the Sixteenth International ries, languages and authoring environments for describing the dy- Joint Conference on Artificial Intelligence (IJCAI-99). namic content of services. Such models require a rich ontology and Morgan Kaufmann Press, 1999. theory of events and processes and will be crucial in enabling and [11] M. Stickel, R. Waldinger, and V. Chaudhri. A Guide to coordinating the activities of autonomous agents. This holds true SNARK. SRI International, Artificial Intelligence Center, regardless of whether the underlying context is one of controlling Menlo Park, CA, 2000. devices, carrying out complex tasks, or obtaining information. In- http://ai.sri.com/snark/tutorial.html. deed, in a Web enabled for intelligent agents, all of these contexts will become intertwined. 5.3 Status of Service/Process Ontology We are developing DAML declarations for the fundamental classes and properties mentioned above for describing services and pro- cess. The latest versions of these may be found at the URL http://www.ai.sri.com/daml/services. 6. CONCLUSION We have presented our vision of a DAML-enabled search archi- tecture, describing several scenarios for how queries will be pro- cessed amd identifying the main software components necessary to facilitate the search. We examined the issue of inference in search, and we address the issue of characterizing procedures and services