Creating Semantic Mind Maps from Linked Data with AutoMind Creator

Csaba Veres
The University of Bergen, Norway
csaba.veres@uib.no

ABSTRACT
AutoMind Creator¹ is an iOS application that lets users interact with linked data to produce customized views. These can be exported as graphical visualizations we call Semantic Mind Maps, as well as rich text (RTF), outline (OPML) and FreeMind formats. We present a new technique for linked data visualisation called Semantic Mind Maps, which are rich mind maps whose nodes are semantically grounded with a defining URI. The maps are essentially a compact knowledge representation format from which users can further explore information of interest. This paper describes the implementation of AutoMind, and highlights some particular pitfalls in programming a commercial application with linked data, especially in the Apple ecosystem.

¹ https://goo.gl/OzAstw

Keywords
linked data, mind maps, iOS, Apple, visualization, knowledge discovery, information space

1. INTRODUCTION
Linked open data presents an exciting opportunity for turning the Web of documents into a Web of data [1]. The Linking Open Data Project has been involved in identifying existing open data sets that can be exposed as RDF. Prominent current examples of such datasets are DBPedia, W3C WordNet, and Geonames. Ideally, linked data descriptions should be machine readable and encompass a useful notion of semantics to enable the programming of knowledge rich applications.

In addition, the web of data also provides an opportunity for humans to find clear, unambiguous facts about topics of interest. DBPedia, for example, summarizes the key facts about topics in Wikipedia pages. These facts could be very useful for humans if they had a suitable application that facilitated user friendly interaction with the data. The motivation for AutoMind was to create a tool for humans to explore linked data and to export interesting summaries of that data in a useful form.

The remainder of this paper explains the design behind the application, including the rationale of semantic mind maps, and discusses some issues with implementation.

2. BACKGROUND
Mind Mapping is a freeform diagramming technique for intuitively capturing key concepts in a domain. It is perhaps the simplest concept diagramming technique, consisting of a single central concept from which sub concepts radiate in independent tree structures. Each branch is labelled with a keyword or image. It is possible to embellish mind maps with additional features, especially if one uses mind mapping software. A typical addition is to highlight concepts with colour, font, shape, or any number of demarcating features. While these additions do not have a formal semantics, they can take on idiosyncratic interpretations for individual mind mappers. In addition, colour and imagery can help with the visual organisation of the concepts in a mind map [3]. The lack of well defined semantics for the model components indicates that mind maps are not so much a formal modelling language as a way to capture “brainstorming sessions” in a concise, structured representation [2]. The visual components are designed for human comprehension rather than formal interpretation.

In spite of the lack of formal rigour, mind maps have proved useful in the software development process. For example, Bia et al. [5] used mind maps to model XML DTDs and Schemas as sets of parallel trees, and implemented XSLT transformations to generate FreeMind mind maps. The advantage of mind maps in representing complex graph structures is that they enable intuitive navigation of the structures with selective hiding of sub branches. The authors managed to successfully model, design and modify complex Schemas by constructing manageable, easily comprehensible diagrams.
The goal with semantic mind maps was to enhance the basic mind map notation in a minimal way that would preserve its simplicity and user friendliness, but nevertheless add a level of semantic description to enrich the expressiveness and comprehension of the map. Eppler [4] notes that mind maps can become inconsistent and comprehensibility can suffer as the size of the mind map grows. Since links have no formal interpretation, concepts can become linked in idiosyncratic ways, and interpretation can suffer. Semantic mind maps are designed to mitigate this problem by semantically grounding the nodes and links of a mind map.

Unfortunately, the creation of semantic mind maps is complicated by the need for grounding each concept. This is where linked data comes in, by providing a supply of grounded, interrelated concepts. The goal of the present application was to enable the exploration of domain-general linked data with semantic mind maps. In particular we used DBPedia as the primary data hub, and exploited several links to other data sources like Project Gutenberg, New York Times Open Data, and the CIA World Factbook. However, the approach can be extended to any data sources. AutoMind Creator is an exploration and visualization tool for linked data.

3. EXAMPLE USE CASE
A primary use case is for students or office workers who want to produce a quick presentation on a topic, or to create a rich basis from which to further develop the presentation. For example, suppose a student has to make a presentation related to Van Gogh, and searches for those words on the first screen of the application. The search returns a number of results related to Van Gogh, including some unexpected ones like “Van Gogh (1948 film)”, “Theo Van Gogh (Art Dealer)”, and “Van Gogh Museum” (see figure 1). In order to differentiate himself from his fellows the student selects “Van Gogh Museum”, which reveals the first property selection screen.

Figure 1: The search screen.

A typical set of properties for the resource Van Gogh Museum is shown in figure 2(a). Selecting a row will initiate a segue to a new table which displays all the objects for the selected predicate. Users can select which new resources to include in their mind map with the use of toggles, as shown in figure 2(b). In addition, when these resources are themselves the subject of another set of triples, selecting the row will cause a transition to a new table displaying the properties which are relevant to that resource. One of these predicates can then be selected, transitioning to a new table with the objects of that predicate. The user can of course go back in the series of tables at any time. The forward or backward transitions can repeat ad infinitum. For example, the user can select “Van Gogh Museum hasLocation Amsterdam”, then on a further screen select “Amsterdam birthPlace (of) Jaap Voigt”, and then “Jaap Voigt subject Dutch field hockey players”, then go back to select another property of “Van Gogh Museum”, and so on.

Figure 2: The information selection screens. (a) Choose property row. (b) Choose property value with toggles.

The user can preview the mind map at any time during the construction process. A typical graph segment is shown in figure 3. The mind map is interactive in that selecting a node will reveal the hyperlinked resource. However, nodes in the mind map cannot be moved, deleted or added. In order to add or delete nodes the user must return to the selection mode and toggle switches to choose the desired resources. A future release might include the option to delete and reorder nodes. However, at this stage the intention is to leave the creation of the nodes with the system of tables. More elaborate modification of the maps would be possible by exporting to a more general mind mapping tool.

Figure 3: A section of a mind map.

The mind map can be shared at any time in several ways, including Twitter or email. Since the code generating the mind map is HTML and JavaScript, it is not possible to share it directly on the typical social networks. Our solution is to upload the HTML to a private FTP server, and share a public link to that file. The source HTML can be saved locally and edited or reused in any way.

The mind map is a rich medium for presenting information. The nodes can be clicked to expand or collapse, to highlight relevant detail. Each node is defined by a grounding URL. Double clicking the node (or long pressing on a touch device) opens a new window showing the resource at the defining URL. An example map for the Van Gogh Museum can be retrieved at http://csabaveres.net/VanGoghMuseum.html.

The application also makes it possible to export the knowledge graph in a number of useful formats, currently restricted to Outline Processor Markup Language (OPML), Rich Text Format (RTF) and the FreeMind mind mapper format. OPML files can be opened with many outliner applications, which present the facts in a bullet list that could be used to explain the topic in a clear sequential manner (figure 4(a)). RTF is a portable document format which can be edited by most popular word processing and presentation software. The RTF text is in a clear tabbed format, complete with citations in the form of URL links, which makes it an ideal basis for an essay (figure 4(b)). Alternatively the points can be easily transformed into a presentation which can also be enhanced with additional points. Finally the mind map can be exported in the format of FreeMind, a popular open source mind mapping application that can be used to edit and supplement the map generated by AutoMind. Our approach to building the tool follows the UNIX philosophy of combining “small, targeted tools” to accomplish bigger tasks, so we do not try to replicate editing capability which is already provided by FreeMind. The intended workflow is that users generate a rich but possibly incomplete mind map with our tool, and then enhance this in FreeMind by adding nodes from sources not available in AutoMind. Since the base map is well structured and provides a coherent semantic framework, there is at least a possibility that additions will themselves be systematic and maintain the semantic integrity of the mind map.

Figure 4: Two export formats. (a) OPML export. (b) RTF export.

4. RESEARCH ISSUES
AutoMind Creator was primarily designed as a practical tool to enable users to navigate, serialize and visualize linked data in a user friendly manner. Widespread adoption of the tool would provide a unique opportunity to study how users create personalized data structures from open linked data.

We have some preliminary evidence that people find the formalism useful in complex information processing tasks. In a currently unpublished master's thesis, a student performed a case study in which she constructed a set of semantic mind maps that captured operations at an insurance company from different operational perspectives. One was a high level map of the company objectives and mission statement. Another captured part of the sales process, and yet another involved data warehouse facts. Thus each map was constructed with concepts that are meaningful to different target audiences who are typically not privy to each other’s concerns. However, the nodes in each map were semantically annotated with concepts from a domain ontology that included concepts from every level of description. Nodes in different mind maps could be related through the ontology. For example, high level strategic goals were related to the sales processes that support those goals, and to data warehouse facts that reported on the success of achieving the goals. By clicking on the hyperlinks of any mind map, the users were directed to different mind maps in which the related concepts appeared. This way managers could see which sales processes were successful, and data warehouse people could see which strategic goal each fact impacted.

The case study was very successful, and users generally found the tool to be easy to use and a great help in understanding processes, especially when there were anomalies in the process. For example, the mind maps revealed that the sales process recorded a new contract at the point an offer was made. However, the accounting process recorded the new contract at the point it was actually accepted and payment made. This explained a previously mysterious gap between the reported number of new contracts and the income from a particular division in the company.

We are hopeful that an intuitive way to exploit linked data will facilitate its popular adoption.

5. SYSTEM ARCHITECTURE
This section describes the implementation of AutoMind in Objective-C, which was the main development language for iOS prior to the release of Swift in June 2014. The logic is entirely client side, but no data is stored locally. Data is retrieved on demand by sending a SPARQL query to an endpoint using an NSURLRequest object.

5.1 Concept Search and Selection
The initial task is to locate the appropriate concept within DBPedia in response to a user’s search request. This could be done in several ways with SPARQL queries filtering on the text string. In the initial implementation we chose to use the DBPedia Lookup Service instead.² The service can be used to look up DBPedia URIs by related keywords. For example the resource http://dbpedia.org/resource/United_States can be looked up by the string “USA” or “United States”. In addition the results are ranked by the number of inlinks pointing from other Wikipedia pages at a result page. Therefore the quality of the returned results is very high. However, the index required for the service has not been maintained and is based on DBPedia 3.8, which is four versions out of date. It is possible in principle to build a new index, but the software provided does not compile a new version, and an extensive discussion on the help forums did not yield any results. In spite of some interest, no one seems to be able to build a new index, and the original developers don’t appear to be very helpful. The newest versions therefore use SPARQL queries extended with the Virtuoso server’s free text indexing capabilities.

² http://wiki.dbpedia.org/lookup/

The search results are presented in a table view which allows the user to select a row. This sends out a volley of SPARQL queries to retrieve data about the resource. There are three SPARQL endpoints in use at the moment: DBPedia at a private mirror which also includes the NYT open data, Gutenberg at the University of Mannheim and CIA at the University of Mannheim. The DBPedia triples are served from a private mirror for two reasons. First, the public DBPedia endpoint can be unreliable. Second, a curated subset of the triples is more user friendly than the entire set available at the public endpoint.

The triples which describe the resource are pooled and presented to the user for selection. Of course different resources will typically have a different set of triples. For example only countries have entries in the CIA World Factbook. Information about the user selected nodes is stored in an SQLite database through the Core Data framework offered in iOS.

5.2 Graph Generation
The goal was to automatically generate high quality mind maps that could readjust as nodes are added or deleted. We decided not to implement our own graphics subsystem, and opted for a freely available framework. There are a number of interesting JavaScript visualization libraries that could be run inside an iOS WebView. After trying a number of them we settled on two of these for production use.

The first framework we used was the ECOTree.js framework.³ This produced realistic looking mind maps from a simple textual specification of the nodes, as in the example below. The instructions can be generated from the database in a straightforward manner.

    var myTree = new ECOTree("myTree","myTreeContainer");
    myTree.add(0,-1,"Apex Node");
    myTree.add(1,0,"Left Child");
    myTree.UpdateTree();

³ http://www.codeproject.com/Articles/16192/Graphic-JavaScript-Tree-with-Layout

A major problem with the ECOTree framework is that it uses the HTML5 Canvas element, whose maximum size is limited by iOS to 3 megapixels for devices with less than 256 MB of RAM and 5 megapixels for devices with 256 MB of RAM or more. This meant that maps beyond 30 or so nodes could not be rendered on iOS, and the exported HTML crashed an Android device on testing.

We switched to the popular D3.js library⁴ for data driven documents. D3.js uses SVG to render the image and is not subject to such resource constraints (at least none that we have discovered so far). Fortunately D3.js can generate tree diagrams from “flat” data as well as its more typical JSON representation, so switching to D3.js mainly involved small adjustments in the way the descriptions are generated from the database.

⁴ http://d3js.org/

5.3 Export Formats
As previously noted, in addition to exporting the graphs as HTML containing the D3.js code, the application can export to RTF, OPML, and the FreeMind mind mapping format. Fortunately, since we chose to represent the basic description of the nodes and their relationships with flat textual descriptions, it was relatively easy to generate all of the formats in the same traversal of the nodes.

Figure 5 shows the relationship between a root node and its first child as represented in the D3.js, OPML, and RTF formats.

6. EXPERIENCE
There are a number of lessons to learn from developing a commercial application in which there is at least an implicit promise of speed, reliability, and continuity.

First, obtaining data is not optimal, even though the data is open linked data. For example the dbpedia.org/sparql endpoint was not reliable during the development process, so we decided to establish our own mirror with a reliable response time.⁵ Unfortunately this did not help with the CIA World Factbook data, which is not available in RDF for running on a local server. Instead the triples are generated from database dumps using D2R, and provided by the University of Leipzig and the Free University of Berlin, so the application is currently at the mercy of that service. If it ceases to be maintained, that part of the application will stop working. As an added problem, the DBPedia mapping file for Project Gutenberg uses an identifier at the FU-Berlin, rather than at the Gutenberg site. For example, the book “The Motor Girls on a Tour” by Margaret Penrose has the identifier http://www.gutenberg.org/ebooks/2789 at Gutenberg, whereas it appears as http://wifo5-04.informatik.uni-mannheim.de/gutendata/page/etext2789 in the DBPedia mapping file. Therefore the developer has the choice of performing some manual URL rewrites, or using the service at the Uni-Mannheim site. This kind of hacking should not be necessary for linked data.

⁵ Thanks to a reviewer for suggesting that the Linked Data Fragments and the Triple Pattern Fragments interface may be helpful. We will certainly explore how this might be of help.

The data contained in sources, primarily DBPedia, can be patchy and idiosyncratic. Many concepts have very little useful information in DBPedia, while some have too many properties to easily assimilate. Further, the automatically extracted data can include non-substantive properties like image size, whose use is further impaired by the unstructured nature of linked data, which makes it impossible to properly apply these predicates. For example there is no way to know which image has the image size property. Thus, the quality of mind maps created from linked data can vary a great deal depending on the root concept. Our partial solution involved curation, where we limited the available DBPedia data sets in our server.

Releasing the application through the Apple distribution channels also proved challenging. During one of the iterations an Apple reviewer noticed the New York Times data and blocked the release until documentation could be produced to prove that we had the rights to use the data. After many iterations in which we explained the status of open data under the Creative Commons license, the app remained blocked. Finally an appeal to the Appeals Board was able to convince them to release the app, but with the condition that all screen shots of NYT pages were removed from the app page. With this warning about potential future problems in such a controlled ecosystem, we are planning on moving our effort to the Android platform.

AutoMind was initially released as a paid application, but subsequently a free version was made available. The free version uses the public service end points with no guarantee of performance or reliability. In addition, the data is restricted to DBPedia and does not include the New York Times or Project Gutenberg links. The DBPedia data is not curated, and therefore includes a significant number of predicates of dubious usefulness.

In terms of further development, we are very keen to extend the social computing aspect of the application. As already mentioned, the mind maps created by our users are uploaded to an FTP server. It is our intention to create a portal around this server through which users can explore the mind maps contributed by the community. In addition we aim to provide tools which can integrate mind maps with overlapping concept nodes.

7. CONCLUSION
AutoMind is a new app for iOS which attempts to be a user friendly tool for humans to navigate their way through linked data, and to summarize their findings with the novel representation of Semantic Mind Maps. Popular adoption of the approach should drive a need for good quality data, which will benefit the linked data effort.

The development process taught us that working with linked data does not necessarily make the task of information search and integration easy. It can sometimes be difficult to link different relevant data sets, contrary to the linked data vision. A uniform interface like Triple Pattern Fragments is certainly a step in the right direction, and we are keen to explore its implications for our application. Raw linked data can be confusing for novice users, prompting the need for curation. Finally, the inclusion of open data in applications controlled by large corporations can be challenging.

8. REFERENCES
[1] Bizer, C., Heath, T., and Berners-Lee, T. (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, Vol. 5(3), pp. 1-22. DOI: 10.4018/jswis.2009081901
[2] Buzan, T., and Buzan, B. (1993). The Mind Map Book: How to use radiant thinking to maximise your brain’s untapped potential. New York: Plume.
[3] Budd, J. W. (2004). Mind Maps as Classroom Exercises. The Journal of Economic Education, Vol. 35, No. 1 (Winter 2004), pp. 35-46.
[4] Eppler, M. A comparison between concept maps, mind maps, conceptual diagrams, and visual metaphors as complementary tools for knowledge construction and sharing. Information Visualization 5, 3, 202.
[5] Bia, A., Munoz, R., and Gomez, J. (2010). Using Mind Maps to Model Semistructured Documents. In Research and Advanced Technology for Digital Libraries, Lecture Notes in Computer Science, Vol. 6273, pp. 421-424. Berlin, Heidelberg: Springer.
[6] Pink, D. H. (2005). Folksonomy. The New York Times, December 11, 2005.
[7] Golder, S., and Huberman, B. A. (2006). Usage patterns of collaborative tagging systems. Journal of Information Science, 32(2), pp. 198-208.

D3.js:

    {"name":"Van Gogh Museum", "parent" : "null", "URL":"http://en.wikipedia.org/wiki/Van_Gogh_Museum","icon":"http://commons.wikimedia.org/wiki/Special:FilePath/Van_Gogh_Museum_Amsterdam.jpg?width=300" },
    {"name":"wikiPageExternalLink..(9)", "parent" : "Van Gogh Museum", "URL":"http://dbpedia.org/ontology/wikiPageExtern
    {"name":"http://www.vangoghmuseum.nl/(10)", "parent" : "wikiPageExternalLink..(9)", "URL":"http://www.vangoghmuseum.

OPML:

    VanGoghMuseum

RTF:

    {\rtf1\ansi\deff0 {\fonttbl {\f0 AppleCasual;}}
    {\colortbl;\red0\green0\blue0;\red255\green0\blue0;}
    Van Gogh Museum\line(http://en.wikipedia.org/wiki/Van_Gogh_Museum)\line\line
    \tab\f0\bullet wikiPageExternalLink..\line
    \tab\tab\f1\bullet http://www.vangoghmuseum.nl/ (http://www.vangoghmuseum.nl/)\line

Figure 5: The same facts in D3.js, OPML and RTF
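Section 5.3 observes that, because the nodes are kept as flat textual descriptions, every export format can be produced in the same traversal. The JavaScript sketch below illustrates that idea only; it is not AutoMind's code, which was written in Objective-C. The record fields (name, parent, URL) follow Figure 5, while the sample values, the helper names buildTree, exportAll and escapeXml, and the exact OPML and RTF layout are assumptions made for illustration.

```javascript
// Hypothetical flat records modelled on Figure 5: each record names its parent,
// and the root uses the string "null" as its parent.
const flatNodes = [
  { name: "Van Gogh Museum", parent: "null", URL: "http://en.wikipedia.org/wiki/Van_Gogh_Museum" },
  { name: "wikiPageExternalLink", parent: "Van Gogh Museum", URL: "" },
  { name: "http://www.vangoghmuseum.nl/", parent: "wikiPageExternalLink", URL: "http://www.vangoghmuseum.nl/" },
];

// Turn the flat list into a nested tree by looking each parent up by name.
function buildTree(records) {
  const byName = new Map();
  for (const r of records) byName.set(r.name, { ...r, children: [] });
  let root = null;
  for (const node of byName.values()) {
    const parent = byName.get(node.parent);
    if (parent) parent.children.push(node);
    else root = node; // a record whose parent is unknown is treated as the root
  }
  return root;
}

// Escape the characters that are unsafe inside an XML attribute value.
function escapeXml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;")
          .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// One depth-first traversal that fills both outputs at each node.
// RTF escaping of backslashes and braces in names is omitted in this sketch.
function exportAll(root) {
  const opml = [];
  const rtf = [];
  (function visit(node, depth) {
    const pad = "  ".repeat(depth + 2);
    const link = node.URL ? ` type="link" url="${escapeXml(node.URL)}"` : "";
    opml.push(`${pad}<outline text="${escapeXml(node.name)}"${link}>`);
    const label = node.URL ? `${node.name} (${node.URL})` : node.name;
    rtf.push(depth === 0 ? `${label}\\line`
                         : `${"\\tab".repeat(depth)}\\bullet ${label}\\line`);
    node.children.forEach((child) => visit(child, depth + 1));
    opml.push(`${pad}</outline>`);
  })(root, 0);
  return {
    opml: ['<opml version="2.0">', "  <head></head>", "  <body>",
           ...opml, "  </body>", "</opml>"].join("\n"),
    rtf: ["{\\rtf1\\ansi", ...rtf, "}"].join("\n"),
  };
}

const tree = buildTree(flatNodes);
const out = exportAll(tree);
console.log(out.opml);
console.log(out.rtf);
```

Collecting each format in its own array during a single recursive visit means a further format, such as FreeMind's XML, would only need one more array and one more push per node.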