zLinks: Semantic Framework for Invoking Contextual Linked Data

Michael K. Bergman
Zitgist LLC
Coralville, IA USA
mike@zitgist.com

Frédérick Giasson
Zitgist LLC
Quebec City, Quebec Canada
fred@zitgist.com

ABSTRACT
This first-ever demonstration of the new zLinks plug-in shows how any existing Web document link can be automatically transformed into a portal to relevant Linked Data. Each existing link disambiguates to its contextual and relevant subject concept (SC) or named entity (NE). The SCs are grounded in the OpenCyc knowledge base, supplemented by aliases and WordNet synsets to aid disambiguation. The NEs are drawn from Wikipedia as processed via YAGO, and other online fact-based repositories. The UMBEL ontology basis to this framework offers significant further advantages. The zLinks popup is invoked only as desired via unobtrusive user interface cues.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information filtering, Query formulation; H.5.4 [Hypertext/Hypermedia]: Navigation; H.5.2 [User Interfaces]: Interaction styles.

Keywords
demo, Linked Data, zLinks, OpenCyc, Wikipedia, WordNet, YAGO, UMBEL.

1. INTRODUCTION
Linked Data [1] follows recommended practices for identifying, exposing and connecting data on the semantic Web. A robust Linked Open Data (LOD)¹ community has rapidly developed around the practice with literally billions of compliant data items now available.

A notable catalyst to the Linked Data movement has been DBpedia [2], which exposes Wikipedia data in best-practices format. It is appropriate that the flagship figure showing the interrelationships of many Linked Data sources has DBpedia positioned at its core [3].

However, the Linked Data community readily acknowledges that the existing semantics and basis for relating compliant datasets are relatively poor. Moreover, there presently are no techniques or methods for relating non-Linked Data to the rapidly growing storehouse of LOD-compliant datasets.

The newest release of the zLinks plug-in (see Figure 1) and its supporting server-side infrastructure directly address the issues of improved semantics for Linked Data matching and relating Linked Data with standard Web content. Our demonstration shows how normal hyperlinks in standard WordPress blogs can be automatically related to contextually relevant Linked Data.

This précis first describes the technical underpinnings to zLinks’ semantic framework, then overviews the application and possible future directions.

2. SEMANTIC FRAMEWORK
W-O-W-Y is the term we apply to the semantic framework for relating a given hypertext link to its relevant Linked Data. The term is derived from the constituent resources of WordNet² (W), OpenCyc³ (O), Wikipedia⁴ (W) and YAGO [4] (Y). Via the WOWY framework, we first determine if the link refers to either a named entity (NE) or a subject concept (SC) as well as to disambiguate alternate senses. If an NE, the entity is also related to its parent subject concept; every link thus has an SC basis.

All canonical SCs are embedded in a subject structure ontology – the “backbone”. Use of this ontology brings inference and other relationship advantages. These various semantic frameworks are described below.

2.1 Subject Concepts
Subject concepts (SCs) are the core constituents to the framework. All SCs are based on existing concepts in OpenCyc, the open source version of the Cyc [5] knowledge base. SCs are the concrete, non-abstract topic-related classes within Cyc. About 22,000 of them were vetted from Cyc (paper in preparation). Aliases for these concepts as maintained by Cyc were combined with matching WordNet synsets to produce the SC disambiguation lexicon.

[Figure 1. zLinks toggle popup]

2.2 Named Entities
The named entities (NE) are drawn from Wikipedia as processed via YAGO, and other online fact-based repositories. NEs are the instances of the SC classes in the standard definition of the term⁵. NEs also have aliases for disambiguation purposes (such as the many ways to refer to the “United States”).

Each NE is mapped to a governing SC for ontology purposes.

2.3 UMBEL Ontology
All of the SCs are expressed in the UMBEL (Upper-level Mapping and Binding Exchange Layer)⁶ ontology. UMBEL is a lightweight structure of subject concepts and their semantic relationships. There is a direct overlap of UMBEL subject concepts to a subset of class concepts within OpenCyc.

Quick relations can be determined from UMBEL for a given SC; more involved inferencing can be directed to OpenCyc.

Thus, via these semantic relationships, other relations such as parent concepts, domains, various entity types, and similar relationships can be obtained once a given SC is identified.

3. APPLICATION
This semantic framework is applied on the server-side once a given standard link is processed. The client-side zLinks plug-in provides the user interface, initial link extraction and results reporting to the user.

3.1 Plug-in Design
The demo is provided as a standard WordPress⁷ PHP plug-in, with many options parameterized.

3.2 Snippet Evaluation
A “snippet” of text based on a word window or sentence surrounding the target link is extracted, parsed, filtered and then submitted to the server for sense evaluation. We use a variant of a graph-based disambiguation algorithm suited for use with large knowledge lexicons [6].

This extraction process can also result in issued queries to standard Web search services.

3.3 User Cues
The semantic result of this link evaluation – the individual zLink – is presented to the user via a subtle, small icon. The results are only presented to the user after a mouseover with set delay, to ensure the popup is purposefully desired and unobtrusive.

3.4 Popup
A popup presents the contextual zLinks results (see Figure 1), with toggle views presented as substitution overlays to preserve screen real estate. Because of the multiple relations possible, there are multiple contextual overlays. The basic results paradigm is taken from Zitgist’s related DataViewer for RDF data.⁸

4. FUTURE DIRECTIONS
This initial zLinks design is but a mere taste of the possibilities with Linked Data from the twin perspectives of additional relationships and presentation templates. We expect rapid developments in both areas.

A generalization of the plug-in architecture will enable extension to other user-content platforms. Still further extending this generalization to the Web browser would bring zLinks capabilities to every Internet user for all existing Web content.

The basic zLinks design also lends itself to incorporating additional sources of named entity lookups and Linked Data.

5. CONCLUSIONS
Linked Data has been a triggering event in the nascent emergence of the semantic Web. We expect to see similar innovations to zLinks – such as Wikify⁹ – emerge to test out different paradigms and interfaces for how best to exploit a Web of Data.

An older zLinks prototype may be found at http://zlinks.zitgist.com/; an updated demo based on the version herein will be posted shortly after the presentation.

6. REFERENCES
[1] T. Berners-Lee. Design Issues–Linked Data. Published online, May 2007. http://www.w3.org/DesignIssues/LinkedData.html

[2] S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, and Z. Ives. DBpedia: A nucleus for a web of open data. In Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, volume 4825 of LNCS, pages 715–728, November 2007.

[3] R. Cyganiak. The Linking Open Data dataset cloud. Figure published and maintained online; version December 2007. http://richard.cyganiak.de/2007/10/lod/

[4] F. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. In WWW ’07: Proceedings of the 16th international conference on World Wide Web, pages 697–706, New York, NY, USA, 2007.

[5] D. B. Lenat. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM 38, no. 11, November 1995.

[6] R. Mihalcea and A. Csomai. Wikify! Linking documents to encyclopedic knowledge. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM '07), pp. 233-241, November 6-8, 2007.

Footnotes
1 http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
2 http://wordnet.princeton.edu/
3 http://www.opencyc.org/
4 http://en.wikipedia.org
5 See http://en.wikipedia.org/wiki/Named_entity_recognition
6 http://www.umbel.org
7 http://www.wordpress.org
8 http://dataviewer.zitgist.com
9 http://www.wikifyer.com/

Copyright is held by the author/owner(s).
LDOW2008, April 22, 2008, Beijing, China.
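
Illustrative sketch (not part of the zLinks implementation). The word-window snippet extraction and filtering described in Section 3.2 might look roughly like the following Python sketch. The function names, the default window size and the tiny stopword list are all assumptions made for illustration; the paper does not specify them, and the actual plug-in is written in PHP.

```python
import re

# Hypothetical stopword list; the real filter used by zLinks is not described.
STOPWORDS = {"a", "an", "the", "to", "of", "in", "on", "and", "or", "for"}


def _normalize(word: str) -> str:
    """Lowercase a token and strip leading/trailing punctuation."""
    return re.sub(r"^\W+|\W+$", "", word).lower()


def extract_snippet(text: str, anchor: str, window: int = 5) -> str:
    """Return the anchor text plus up to `window` words on each side.

    Only the first occurrence of the anchor is used; an empty string is
    returned when the anchor text does not occur in `text`.
    """
    words = text.split()
    target = [_normalize(w) for w in anchor.split()]
    n = len(target)
    if n == 0:
        return ""
    for i in range(len(words) - n + 1):
        if [_normalize(w) for w in words[i:i + n]] == target:
            start = max(0, i - window)
            end = min(len(words), i + n + window)
            return " ".join(words[start:end])
    return ""


def filter_terms(snippet: str) -> list[str]:
    """Drop stopwords and empty tokens, keeping normalized context terms."""
    terms = (_normalize(w) for w in snippet.split())
    return [t for t in terms if t and t not in STOPWORDS]


if __name__ == "__main__":
    text = "The senator flew to Washington last week to meet the president."
    snippet = extract_snippet(text, "Washington", window=2)
    print(snippet)                # flew to Washington last week
    print(filter_terms(snippet))  # ['flew', 'washington', 'last', 'week']
```

In this sketch, the filtered terms stand in for the context that would be submitted to the server for sense evaluation; the graph-based disambiguation step itself [6] is not shown.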