Towards Ontology-Based Yellow Page Services

Mikko Laukkanen
TeliaSonera Finland
P.O. Box 970 (Teollisuuskatu 13), FIN-00051 SONERA
mikko.laukkanen@teliasonera.com

Kim Viljanen, Mikko Apiola, Petri Lindgren, and Eero Hyvönen
Helsinki Institute for Information Technology (HIIT), University of Helsinki
P.O. Box 26 (Teollisuuskatu 23), 00014 University of Helsinki, Finland
firstname.lastname@cs.helsinki.fi

Abstract

This paper discusses the possibilities of the Semantic Web technologies in both annotating services and delivering relevant services to end-users. We propose an ontology-based mechanism for both advertising and finding the services. The essential parts of the system are ontologies for describing and storing service advertisements, a semantic service finder for the end-user, and a semantic service annotation editor for service providers.

1 Introduction

Yellow page directory services¹ on the Web are a widely used business concept for helping people to find companies providing services and selling products. Despite the versatility of the possibilities, it can still be difficult for the end-user to map a need to the services offered [1, 2, 3]. On the other hand, for the service provider, it may be difficult to index the service in such a way that the end-users would not miss it. The problems with yellow page services arise in situations where the end-user is not able to state precisely what kind of service would serve her needs.

The work presented in this paper represents the ongoing work of the IWebS (Intelligent Web Services) project², which studies the possibilities of the Semantic Web [4] and Web Services [5] technologies in both annotating the services and delivering the relevant services to the end-users. We propose an ontology-based mechanism for both advertising and finding the services. The idea is to let the various actors in the IWebS system (in this case the end-users and the service providers) use the terms and concepts that they are familiar with. These concepts are then mapped to the ontologies within the system. The general architecture of the IWebS system is depicted in Figure 1. The essential parts of the system are ontologies for describing and storing the service advertisements (the IWebS knowledge base), a semantic service finder for matching the services for the end-user, and a semantic service annotation editor for the service providers.

[Figure 1. The general architecture of the IWebS system. Components shown: End-user, Service Finder, IWebS Knowledge base, Annotation Editor, Web services, Service Provider.]

This paper is organized as follows. In Section 2 we give some background information on the problem area of mapping the end-user's need to a service. Section 3 describes a scenario motivating the need for the IWebS system. In Section 4 we discuss how the end-users search and find services using the IWebS system. Section 5 describes the ontologies used within the IWebS system, and explains the means for annotating new services. In Section 6 we address issues that are not covered in the current version of the IWebS system. Finally, Section 7 concludes this paper.

¹ e.g., http://www.yell.co.uk
² http://www.cs.helsinki.fi/group/iwebs/

2 Background

Online yellow page services are a widely used service model for matching the need of an end-user with the corresponding products and services offered by companies. The business idea of yellow pages is based on helping end-users to find services as easily as possible, and on providing the advertising companies with a highly targeted marketing medium for reaching end-users who are trying to find companies for a specific need.

[Figure 2. A query to the IWebS system. Labels shown: service, target, location (Helsinki), time (Evening), goal (?).]

A typical online yellow page service provides the user with keyword-based search and hierarchical or flat-list navigation [3].
In the keyword-based search the end-user locates services by just typing a few keywords into a search engine. The end-user does not have to figure out which categories in the yellow pages may be relevant from the viewpoint of her need. However, the end-user needs to know the relevant keywords. Also, the matched document does not necessarily prove to be relevant, for instance, if the keyword was stamp and the retrieved document contains the phrase “We do not sell stamps, but...”. Furthermore, a textual description is found only if it contains the explicit keyword. For example, one may be interested in companies dealing with astronomy. A telescope advertisement is not found unless it happens to mention the word astronomy, which may be too obvious to be mentioned.

In the case of hierarchical or flat-list navigation a typical yellow page service provider maintains a list or a hierarchy of product and service categories, such as “Electronic equipment” or “Car Rental”. All advertisements are then placed under one or several categories to help the user to find the services. Based on the categorization, the user can navigate to the category that best fits the user's intentions. However, from the viewpoint of the user's need, indexing a company's advertisements according to a business category, such as the aforementioned product and service categories, is not very useful unless the category unambiguously implies the services offered to the user. For example, a camera repair service can potentially be offered by an importer company, appliances shop, camera shop, photo shop, or an optician. What kind of companies offer repair services depends, for instance, on the service business at hand, on the thing and brand being repaired, and on local practices. In order to enhance the search capabilities of yellow pages, the advertising companies should state more clearly what services they actually offer.

The IWebS project has been launched to investigate the possibilities the Semantic Web and the Web Service technologies offer for creating more effective methods for finding real-world services. The goal of IWebS is to create an intelligent yellow pages service, where the semantically annotated services cover both static and dynamic advertisements, whose availability to the end-user depends on the context in which the user and the services are.

3 Usage Scenario

In the scenario an end-user named Cathy is attending a conference, where she got the chance to get in the same picture with a famous invited speaker. At the end of the conference day Cathy wants to get the digital picture printed on a t-shirt, which she will keep as a souvenir from the conference.

Cathy has the evening off, so she would like to get the picture printed right away. However, Cathy does not have any idea of which service provider (e.g., a shop or a print house) could do the job for her. Therefore, she activates her smartphone, and presents the problem to the IWebS system. Cathy specifies that the related objects for the query are the picture and a t-shirt. The service needs to be available within the Helsinki downtown area, and it has to be open in the evening. Cathy does not know what the procedure (i.e., the goal) of getting the picture onto a t-shirt is called, so she leaves that for the IWebS system to find out. The information Cathy specifies in her query is visualized in Figure 2.

The IWebS system processes the query, matches the service providers, and returns a list of relevant service providers to Cathy together with information about the services themselves, their opening times, and directions on how to get there. In this case the list contains the following service providers:

• A nearby shop selling t-shirts and a print house. In this case Cathy needs to first buy the t-shirt, and then let the print house print the picture on it.

• A photo shop and a specialized stall on a market. The market stall happens to be available in Helsinki that week, and it sells t-shirts with any picture printed on them. But before this, Cathy needs to go to the photo shop to have the picture developed.

• An art shop, a nearby shop selling t-shirts, and a photo shop. The art shop sells the required equipment for printing the picture on a t-shirt by oneself. To do that, Cathy needs to buy the t-shirt and develop the picture.

From these, Cathy chooses the market stall, because after the printing she has some time to wander around the market.

The scenario so far has assumed that there indeed exist services annotated in the IWebS system. We will now extend the scenario to show how dynamically available services, such as the moving market stall, are added to and updated in the system.

A moving market stall holder, Mick, is planning to visit Helsinki this week to do some business. He has added an advertisement to the IWebS system earlier, and would like to update his service offering definitions. He activates his Palm device, changes all the services he offers (selling t-shirts, souvenirs, and hot dogs) to be available in Helsinki, and provides the exact location and opening time information. He also adds a new service, printing pictures on t-shirts, by annotating it to be related to the terms “print”, “picture” and “t-shirt”. In addition, the IWebS system suggests that the new service should relate to “Personal appearance” and “Refresh / entertainment” based on other similar annotations. Mick agrees, and commits the changes to the IWebS system.

4 Searching for the Services

In a general user-driven information retrieval system the user's input can be collected implicitly (the user's context and profile), explicitly by keywords typed by the user, or explicitly by navigation-based input. In our case, based on the input from the user, the system must be able to present the user's problem (e.g., Cathy's problem in Figure 2) in such a format that the problem can be solved by the available services.

OntoSeek [1] provides the user with a natural language interface where the user can describe her problem using arbitrary natural language terms and describe relations between them as lexical conceptual graphs, which resemble Figure 2. OntoSeek uses ontologies such as WordNet [6] for expanding the queries with, e.g., synonyms, which helps to match the queries with the natural language advertisements. The YPA system provides the user with a natural language search over yellow page advertisements [2]. The system uses natural language processing and information retrieval technologies for searching the semi-structured advertisements. With YPA the user can ask questions like “I need to get my camera repaired!”, which are answered based on the advertisements and the world model (WordNet [6]).

The strength of both the OntoSeek and YPA systems is that any collection of natural language advertisements can be queried by the systems. This is also a weakness, since natural language understanding can be difficult and error-prone. OntoSeek uses lexical conceptual graphs to present the queries, but since the vocabulary and the relations are unconstrained, the graphs cannot be validated and hence the queries can be unsound [1].

In our work, we propose using restricted terms and relations, described by ontologies, for making the queries and for describing the available services. If the right terms and relations are found, this helps both the end-user and the advertisers in describing their needs and offerings. In addition, using ontologies the user interfaces can be built in such a way that they help and direct the user's actions towards semantically sound results, such as in the MuseumFinland [7] system.

As a first step we have tested the MuseumFinland framework [7] in the IWebS context. The MuseumFinland user interface is based on the idea of view-based search [8], where the user can make multiple selections from different views on the underlying content, presented in RDF(S). The views can be presented as tree-structured categorizations. The user can make queries to the underlying content by making selections (restrictions) using one or several views. The result of the query is the set of resources that match all the restrictions. In MuseumFinland the view-based search has been extended by keyword-based search, which provides an additional way to define restrictions to the query.

The outcome of the test was that the MuseumFinland-based IWebS system made it possible to do view-based search on the advertisements imported into the underlying knowledge base. In this first step we used only two views (service categorization and location), of which the service categorization is too difficult for end-users to use. In the following, we will describe more user-friendly ways of representing the query.

5 Service Ontologies and Annotation

Service metadata is needed for ontology-based search to function efficiently. We argue that creating semantically correct metadata is a fundamental problem of the Semantic Web. The creation of metadata can be done either automatically or manually. Automatic annotation usually means processing large amounts of existing data using natural language processing and data mining techniques. Depending on the data, automatic annotation can be too complicated a task for computers, and some help from a human user is required. Within the IWebS system, the most relevant problems relate to manual annotation: how to get the best possible annotation with minimal effort from the user, how to automate the process to its full potential, and how to validate the annotation.

5.1 Describing the Services using Ontologies

Traditional yellow page services are classified from one point of view, as described in Section 2. We are interested in describing the services as processes, which have goals and targets (e.g., the t-shirt in Cathy's problem) and take place in time and location. The services in the IWebS system are described using a set of ontologies, which are goal, target, service provider, service offering, Standard Industrial Classification (TOL) [9], Classification of Individual Consumption by Purpose (COICOP) [10], time, and location. The goal and target ontologies are intended for specifying the end-users' needs. The service provider, service offering, and classification ontologies are used for describing the service offerings.

The goal ontology consists of abstract concepts, which express activities such as Alter, Copy, Create, Erase, and Move. The terms in the goal ontology imply the abstract meaning of several domain-specific terms, and aim at giving the user the means to query the services using common sense. Thus, the user does not have to know any domain-specific terms when querying services from the goal viewpoint. Very similar terms can be found in the existing Process ontology in the Suggested Upper Merged Ontology (SUMO) [11], which could perhaps be translated to Finnish and used as a goal ontology in the IWebS system.

The product and the “fields of life” ontologies are used for describing the targets of the service offerings. The product viewpoint is defined by the COICOP. The top level of the fields of life ontology consists of Home, Work, Health, Education, Refresh/Entertainment, Personal Appearance, Capital, Food and Supplies, and Social Interactivities.

An instance of a class in the service provider ontology can be anything that is able to provide one or more services. The services in turn are modeled as service offerings. For instance, a barber shop is a service provider, which has two service offerings: making haircuts and selling hair lacquers.

All ontologies are presented in OWL format. The time ontology was created by ourselves, and it is used to present, for instance, the opening times. The location ontology was imported from the MuseumFinland project [7].

The ontologies are bound together using properties. Figure 3 depicts these bindings. The service provider has one or more service offerings, and the service provider is located at some location. The service offering also has a location, which can be different from that of its service provider. Furthermore, the service offering is classified using the TOL. Finally, the service offering may have a goal, and it is targeted at either an instance in COICOP or in the “field of life” ontology.

[Figure 3. The ontologies and their relationships within the IWebS system. Nodes shown: Service Provider, Service Offering, Location, Goal, TOL, COICOP, Field of Life, Time. Property labels shown: hasLocation (twice), hasGoal, hasTarget, hasTOL, hasOpeningTimes.]

5.2 Using an Annotation Editor for Creating Service Annotations

There exists a range of annotation editors, such as Annotea [12], the SHOE Knowledge Annotator [13], AeroDAML [14], MnM [15], and OntoMat [16].

In Annotea [12], annotation means attaching users' “comments”, such as advice, change suggestions, or opinions about the page, to web pages. Although the annotation RDF schema in Annotea can be extended by users, it does not support the use of multiple ontologies, as needed in, e.g., the IWebS system. Annotea is based on a document-centric approach, where the users are browsing documents and examining annotations related to them. The annotations are not intended for helping to find the data.

The SHOE Knowledge Annotator [13] aims at annotating Web pages by linking them to ontologies using its own SHOE language. The annotations are then collected on a server and used for finding the pages more easily. SHOE does not support RDF.

AeroDAML is a Web service, which automatically annotates a given Web page using a given ontology with the help of WordNet [14].

MnM [15] and OntoMat [16] are aimed at solving problems of automatic annotation. They include features such as automatic and semi-automatic extraction of text phrases from documents, using techniques such as natural language processing.

All of the editors mentioned above are hard to use for persons who have minimal skills in computer usage and who are not familiar with ontological concepts or the problems of annotation. In our case, the editor should be easy to use for users interested in describing their services but not interested in technical details about annotating.

Our goal is to develop a user-friendly annotation editor, which guides the annotator to make correct annotations based on the ontologies. One possibility is to provide the annotator with a multi-view-based user interface, which restricts the choices in the annotation during the annotation process. By a multi-view-based user interface we mean an interface similar to the search interface described in Section 4. For example, the annotator can start the annotation by classifying the service to some inland location in the location ontology. Then, the other ontologies will be restricted so that the system guides the annotator to a reasonable annotation. In this situation, a service classification ontology would be restricted so that it would not be possible to annotate the service to the “waterborne-traffic” class, since the location is (based on the ontological knowledge) far away from water.

Based on the ontologies, annotation recommendations could be created suggesting services that the annotator might offer. Recommendations could be created based on ontological rules and existing annotations. For example, a user annotating her service as a barber shop could be asked whether her shop also sells hair lacquers or other related products.

After the initial service annotation, the updates to the annotation can be done either by a human end-user (i.e., the service provider), or by a legacy system of the service provider. For instance, for a small flower shop owner it is easier to use a Web-based tool to edit the annotation for the shop. However, medium or large companies, such as restaurants or gas station chains, could integrate their legacy systems to automatically keep the annotation up-to-date. This can be done by using the Web service interface to the annotation editor (see Figure 1).

5.3 Importing Instance Data

For importing the instance data from existing databases or other information sources we have built a publication pipe (see Figure 4), with which we have translated service provider and service offering data stored in a legacy database into corresponding OWL instances. The service provider annotations cover over 200 000 advertisements representing service providers all across Finland.

The publication pipe operates in three subsequent phases. In the first phase the data is encoded into XML. The second phase translates the XML-encoded data into the OWL language. This phase is the most important, and requires data-specific translators. The output of this phase is the OWL ontology (classes and properties) and the actual instances representing the original data. In the third phase the generated OWL ontology is validated. If the final validation phase passes, the data is correctly transformed into OWL, and is usable by the IWebS system.

[Figure 4. Importing instance data into IWebS. Labels shown: External Information Sources (databases and classifications), Existing ontologies from the Internet, Self-made ontologies, DB-to-XML, XML-to-OWL, Validation, IWebS knowledge base.]

6 Future Work

The current version of the IWebS system provides both keyword-based and navigation-based user interfaces for querying services. In the future we will improve the query interface so that the end-user does not have to know explicitly what she is looking for. We are aiming at a solution where the end-user only needs to express her problem to the IWebS system, which in turn infers what kind of services could solve the problem.

We are also interested in dynamic content, whose availability to the end-user depends on the contexts where both the end-user and the service provider are. The service providers are given the possibility to update their service profile on the fly. The update should be done either by hand using the annotation editor, or automatically using Web services, which integrate the service providers' legacy systems into the IWebS system (see Figure 1). In doing so, for instance, a barber shop could advertise a happy hour with discounted prices in an ad hoc manner.

The IWebS system is intended to be available both for stationary (desktop) and mobile users. Currently only the former case is supported. The mobile devices range from low-end mobile phones to high-end personal digital assistants (PDA) and smartphones. Thus, the user interfaces for the IWebS system need to range from mobile phones to a full-blown Web (or XForms [17]) browser.

Finally, since the quality of the data in yellow page services is higher than that of the data in the Web and, on the other hand, the companies' public Web pages typically contain more information than the advertisements in the yellow pages, the yellow page data could perhaps be used as bootstrap data for a domain-specific internet search engine that would index advertisers' Web pages.
This would combine the benefits of the closed, high quality service advertisement registry with the greater variety of information published on the Web by the advertisers.

7 Conclusion

In this paper we introduced the work being done in the IWebS project, which studies the possibilities of the Semantic Web and Web Services technologies in both annotating the services and delivering the relevant services to the end-users. The IWebS system differs from other online yellow page services in that it utilizes ontologies in both queries and service annotations. The baseline idea is to let the end-user and the service provider use the terms and concepts that they are familiar with. These concepts are mapped to the ontologies within the system. The essential parts of the system are ontologies for describing and storing the service advertisements, a semantic service finder for matching the services for the end-user, and a semantic service annotation editor for the service providers.

The current prototype of the IWebS system is based on the MuseumFinland framework [7]. The MuseumFinland user interface is based on the idea of view-based search [8], where the user can make multiple selections from different views on the underlying content. The view-based search has been extended by keyword-based search, which provides an additional way to define restrictions to the query.

Acknowledgements

This work was funded by the National Technology Agency Tekes, Fonecta, TeliaSonera, and TietoEnator.

References

[1] N. Guarino, C. Masolo, and G. Vetere, “OntoSeek: Content-Based Access to the Web,” IEEE Intelligent Systems, pp. 70–80, May/June 1999.

[2] A. De Roeck, U. Kruschwitz, P. Neal, P. Scott, S. Steel, R. Turner, and N. Webb, “YPA - an intelligent directory enquiry assistant,” BT Technology Journal, vol. 16, no. 3, pp. 145–155, 1998. [Online]. Available: citeseer.ist.psu.edu/roeck98ypa.html

[3] E. Hyvönen, K. Viljanen, and A. Hätinen, “Yellow Pages on the Semantic Web,” in Towards the Semantic Web and Web Services, the Proceedings of XML Finland 2002 Conference, 2002, pp. 3–14.

[4] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, vol. 284, no. 5, pp. 34–43, May 2001.

[5] D. Booth, H. Haas, F. McCabe, M. Champion, C. Ferris, E. Newcomer, and D. Orchard, “Web Services Architecture,” Aug. 2003, W3C Working Draft 8, available at: http://www.w3.org/TR/2003/WD-ws-arch-20030808/.

[6] C. Fellbaum, Ed., WordNet: An Electronic Lexical Database. The MIT Press, May 1998, ISBN 0-262-06197-X.

[7] E. Hyvönen, M. Junnila, S. Kettula, E. Mäkelä, S. Saarela, M. Salminen, A. Syreeni, A. Valo, and K. Viljanen, “Finnish Museums on the Semantic Web. User's Perspective on MuseumFinland,” in Museums and the Web 2004 (MW2004), Arlington, Virginia, USA, Mar. 2004.

[8] A. S. Pollitt, “The Key Role of Classification and Indexing in View-Based Searching,” University of Huddersfield, UK, Tech. Rep., 1998, available at: http://www.ifla.org/IV/ifla63/63polst.pdf.

[9] Statistics Finland, Standard Industrial Classification TOL 2002. Helsinki: Valopaino, 2002, ISBN 952-467-097-6, available at: http://www.stat.fi/tk/tt/luokitukset/index_talous_keh_en.html.

[10] United Nations, Statistics Division, Classification of Individual Consumption by Purpose (COICOP), New York, USA, 1999, available at: http://unstats.un.org/unsd/cr/registry/regcst.asp?Cl=5&Lg=1.

[11] I. Niles and A. Pease, “Towards a Standard Upper Ontology,” in The Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001), 2001.

[12] J. Kahan, M. Koivunen, E. Prud'Hommeaux, and R. Swick, “Annotea: Open RDF Infrastructure for Shared Web Annotations,” in The Proceedings of the WWW10 International Conference, 2001.

[13] J. Heflin and J. Hendler, “A Portrait of the Semantic Web in Action,” IEEE Intelligent Systems, vol. 16, no. 2, 2001.

[14] P. Kogut and W. Holmes, “AeroDAML: Applying Information Extraction to Generate DAML Annotations from Web Pages,” in The First International Conference on Knowledge Capture (K-CAP 2001). Workshop on Knowledge Markup and Semantic Annotation, Victoria, B.C., Canada, Oct. 2001.

[15] M. Vargas-Vera, E. Motta, J. Domingue, M. Lanzoni, A. Stutt, and F. Ciravegna, “MnM: Ontology Driven Semi-Automatic and Automatic Support for Semantic Markup,” in The Proceedings of the 13th International Conference on Knowledge Engineering and Management (EKAW 2002), A. Gomez-Perez, Ed. Springer Verlag, 2002.

[16] S. Handschuh, S. Staab, and A. Maedche, “CREAM - Creating Relational Metadata with a Component-Based, Ontology-Driven Annotation Framework,” in The First International Conference on Knowledge Capture (K-CAP 2001), Victoria, B.C., Canada, 2001.

[17] M. Dubinko, L. L. Klotz, Jr., R. Merrick, and T. V. Raman, “XForms 1.0,” Oct. 2003, W3C Recommendation, available at: http://www.w3.org/TR/2003/REC-xforms-20031014/.
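
As a closing illustration, the view-based search of Section 4 (a query result is the set of resources that match all selected restrictions) can be sketched over service offerings annotated along the views of Section 5.1. This is only a minimal sketch under stated assumptions, not the IWebS implementation: the class name, field names, the sample offerings, and the flat string values are all hypothetical, and the ontology hierarchies (for example, a downtown area being part of Helsinki) are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceOffering:
    """Hypothetical, flattened stand-in for an annotated service offering."""
    name: str
    provider: str
    goal: str                 # goal view, e.g. "Create" (assumed value)
    targets: frozenset        # target view
    location: str             # location view
    opening_times: frozenset  # time view, simplified to parts of the day


def view_based_search(offerings, restrictions):
    """Return the offerings that match every restriction.

    `restrictions` maps a view (field name) to the wanted value. For
    set-valued views the wanted value must itself be a set, and all of
    its members have to be present in the offering's annotation.
    """
    def matches(offering, view, wanted):
        value = getattr(offering, view)
        if isinstance(value, frozenset):
            return set(wanted) <= value
        return value == wanted

    return [o for o in offerings
            if all(matches(o, view, wanted)
                   for view, wanted in restrictions.items())]


# Sample data invented for illustration only.
offerings = [
    ServiceOffering("printing pictures on t-shirts", "Mick's market stall",
                    "Create", frozenset({"picture", "t-shirt"}),
                    "Helsinki", frozenset({"day", "evening"})),
    ServiceOffering("printing pictures on t-shirts", "Print house",
                    "Create", frozenset({"picture", "t-shirt"}),
                    "Tampere", frozenset({"day", "evening"})),
    ServiceOffering("making haircuts", "Barber shop",
                    "Alter", frozenset({"hair"}),
                    "Helsinki", frozenset({"day"})),
]

# Cathy's query from Section 3: targets, location and time are given,
# the goal is left open.
cathy_query = {
    "targets": {"picture", "t-shirt"},
    "location": "Helsinki",
    "opening_times": {"evening"},
}
result = view_based_search(offerings, cathy_query)
print([o.provider for o in result])
```

Leaving a view out of the query, as Cathy does with the goal, simply means that no restriction is applied on that view; adding a restriction can only narrow the result.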