=Paper=
{{Paper
|id=Vol-1311/paper9
|storemode=property
|title=VISFACET: Facet Visualization Module for Modern Library Catalogues
|pdfUrl=https://ceur-ws.org/Vol-1311/paper9.pdf
|volume=Vol-1311
}}
==VISFACET: Facet Visualization Module for Modern Library Catalogues==
Miriam Allalouf<sup>1</sup>, Dalia Mendelsson<sup>2</sup>, and Evgeniy Mishustin<sup>1</sup>

<sup>1</sup> Azrieli College of Engineering, Jerusalem

<sup>2</sup> The Hebrew University of Jerusalem

===Abstract===
The “next-generation” catalogues of academic libraries provide a discovery layer that contains faceted classification and search features and suggested topics selected by their rankings. To improve the discovery process, this paper demonstrates an interactive faceted visualization box, termed VisFacet, that extends the catalog interface and allows users to narrow or broaden their search results filtered by suggested topics or facets in an interactive manner. The VisFacet software visualization module was integrated into and contributed to the VuFind open source software system. VuFind is a development portal that enables libraries to customize their own catalogue interfaces and discovery layer. Thus, extending VuFind with VisFacet provides all catalogues using VuFind at their system’s core with the benefit of having an infographic interactive search box. We will describe the challenges encountered during the development of the VisFacet project, including the user discovery satisfaction questionnaire results.

===1 Introduction===
The traditional academic library catalogue enables end users to search for and find requested information. With the development of the Internet, and the introduction of Google as a major player in the information retrieval arena, the search patterns began to change. The Google-like search method influenced the way that library catalogues are being developed to offer a similar user experience [1], [2].

Traditional library catalogues were designed for the tasks of searching, identifying, selecting, and accessing information. The information retrieval is based on predefined indexes. The user had to be familiar with the specific fields involved in the information retrieval process.
For example, users were expected to know how to construct complex search queries, utilize subject searching, and apply Boolean search operators [3].

Over time, the catalogue did change and a new generation of catalogues became more popular in academic libraries. The breakthrough of providing web-based online public access to library collections and resources did not require changes in the Integrated Library Systems’ (ILS) core. The new concept was developed only at the presentation layer of the Online Public Access Catalogue (OPAC). Additional indexing at the presentation layer provided the essential discovery infrastructure to the “next-generation” catalogues. Nowadays, libraries’ next-generation catalogues offer academic resources based on the advances of the information retrieval technologies; they are trying to meet the readers’ new expectations and enhance their experience by making library catalogues more user friendly, intuitive, and visually attractive [4]. The new discovery systems contain predictive search features (such as “Did you mean?”); user profile-aware content, such as tags, ratings, reviews, and comments; and a faceted classification.

The faceted classification feature that classifies items into multiple independent categorization schemes (or facets) is a topic from library science that has become popular in information computing [5]. In this important approach, the information retrieval system allows the assignment of an object to multiple attributes, thus enabling the classification to be ordered in multiple ways, rather than in a single, predetermined order. For example, a collection of books might be classified using an Author facet, a Region facet, an Era facet, and so on. The Region facet consists of clusters of records from the list, each of which is associated with a region-related keyword.
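
As a rough illustration of the faceted classification described above, the following Python sketch counts how many records fall under each value of each facet. The field names (`author`, `region`, `era`), the helper `facet_counts`, and the sample records are all invented for illustration; a real catalogue computes such counts inside its search engine index rather than in application code.

```python
from collections import Counter

# Invented sample records; a record may carry several values per facet.
RECORDS = [
    {"title": "Title 1", "author": ["Shakespeare, William"],
     "region": ["England", "France"], "era": ["15th century"]},
    {"title": "Title 2", "author": ["Shakespeare, William"],
     "region": ["England"], "era": ["15th century"]},
    {"title": "Title 3", "author": ["Author X"],
     "region": ["France"], "era": ["17th century"]},
]

FACETS = ("author", "region", "era")


def facet_counts(records, facets=FACETS):
    """Return {facet: Counter(value -> number of records carrying that value)}."""
    counts = {facet: Counter() for facet in facets}
    for record in records:
        for facet in facets:
            # set() ensures a record is counted once per value, even if repeated.
            for value in set(record.get(facet, [])):
                counts[facet][value] += 1
    return counts


if __name__ == "__main__":
    for facet, counter in facet_counts(RECORDS).items():
        print(facet, counter.most_common())
```

Each facet is an independent view of the same record list, which is why a single record can appear in both the “England” and the “France” clusters of the Region facet.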
When a user searches the catalogue, the system presents, along with the usual search engine results page (SERP), different predefined facets. By clicking on them, the user can narrow the search and refine the results set. All the suggested information is presented to users in textual format in several separate fields of the SERP. Using faceted navigation with suggested topics has proven to be one possible way of enhancing the user’s navigation and exploration, but with the growing numbers of library materials, and users who demand easy access and high-performance systems, there is still a need for further development.

Information visualization is an emerging component in many scientific research areas such as digital libraries, data mining, and financial data analysis. Merčun and Žumer [6] discuss the demonstrated importance of visual information in library catalogue presentations, particularly in simple visualizations such as tag clouds, time sliders for refining search, featured cover displays of new acquisitions, recently borrowed items, bookshelf display, and vocabularies such as the “word cloud” in AquaBrowser [7]. In our opinion the next-generation catalogue, being an exploratory tool designed for discovery, should add information visualization for exploratory faceted search to its capabilities.

This paper describes the VisFacet plug-in framework we have designed and developed. VisFacet is an interactive, faceted search and navigation visualization system. It uses a book’s existing metadata information to provide an intuitively understood and visually attractive graph of facets. It appears within and extends the SERP. The VisFacet software modules have been integrated into the VuFind system. There is currently no next-generation catalogue that integrates a visualized and interactive faceted search.
VuFind (www.vufind.org) is an open source, next-generation library resource portal that enables the development of a customized, faceted navigation system. VuFind provides a full set of modules to produce a customized cataloguing system layer in which libraries can implement all the features they want, modify the existing modules to best fit their needs, and add new modules to extend their resource offerings. The Library Authority of the Hebrew University customized the VuFind system modules and built the HUfind next-generation catalogue. HUfind has been in production since 2012. Our VisFacet framework extends the VuFind system’s capabilities with the visualization feature, and all catalogues that use VuFind at their system’s core (such as HUfind) can benefit from VisFacet. Moreover, today there are several development platforms for modern cataloguing of library systems, and none of them has a faceted visualization framework such as the one we are suggesting.

VisFacet makes three main contributions:

1. It adds the visualized, interactive search capabilities of the facets to the VuFind system. The view includes objects filtered by a variety of similar topics and by facets such as Era and Region.

2. It adds a broadened search capability to the visualized view, which is a completely new feature.

3. It contributes the code to the VuFind open source project.

===2 Related Works===
In chapters thirteen and fourteen of their book [8], Shneiderman et al. provide a comprehensive survey of human-computer interaction in general, particularly in the context of information visualization for information search and retrieval. Recent trends in information technology show that a visualized representation of the suggested topics, in which the user can see the relations between clusters drawn graphically and click on them, is more understandable and intuitive [9], [10], [6]. There are several works that explore data visualization as a way to enhance the digital library results page.
ResultMaps [11] is a treemap-based search visualization system, developed as an extension to digital libraries. It uses hierarchical subject classification to map each repository document into a treemap and highlight items that correspond to the current query. ResultMaps works very well for small repositories consisting of hundreds to thousands of records, but does not scale well, since the interface quickly becomes unreadable as the repository grows. FacetMap [12] is another visualization approach to faceted navigation systems that enables interactive visualization of large databases, though it is limited to the Windows operating system rather than the web. AquaBrowser [7] is a commercial next-generation library catalogue that provides its own data retrieval algorithms and unique features to satisfy the end user’s high requirements. Its “Discover” feature looks similar to our VisFacet. However, it does not provide a narrowing faceted-search capability: it performs a new search each time a user clicks on it. Tiara [13] is a text visualization system that was built to aid users in exploring and analyzing large text collections: given a user query, Tiara provides rich graphic interactions with informative and powerful visual text summaries. Thai et al. [14] suggest a new visual ordering of faceted visualization in which a matrix-based multidimensional visualization is used for modeling the relations between documents. There is a tight correlation between the information format, the type of analysis the information requires, and the visualization design.

===3 VisFacet Framework===
The aim of this research was to enrich the discovery layer of the catalogue with an interactive side-box that presents the facets of the search results graphically. Fig. 1 presents the current textual view as provided by the VuFind package.
The search term in this example is the word “king.” The area above the set of results presents a list of Suggested Topics that were found in the records addressing the term “king,” such as “history” and “kings and rulers.” The area on the right provides the facets, namely the books taken from the left column filtered by a variety of subjects. For example, within the “Author” category we can see a cluster that contains all the records (202) from the current set of search results that appear to be written by William Shakespeare. Thus, the search system allows the user to search from one search box and then narrow down the results by clicking on the various facets of the results.

Fig. 1. VuFind with VisFacet integrated.

The VuFind information retrieval engine, like that of most modern library catalogues, uses MARC records as the core for its search engine index schema over the ILS core. MARC (MAchine-Readable Cataloging) is the de facto standard for bibliographical records with metadata for each item (that is, book, journal, movie, and so on). The metadata are inserted as indexes in the VuFind internal database, along with their associated metadata. Given a term to search, the catalogue’s search engine retrieves the associated records from its index. At the same time, it retrieves a list of predefined facets (sometimes also called clusters), each of which represents a logical conjunction of the current set of the results filtered by several predefined fields. For example, the Topic field is represented in the MARC records with the MARC code 650.
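
The “logical conjunction” of filters mentioned above can be sketched in a few lines of Python. This is a simplified illustration and not VuFind’s implementation: records are plain dictionaries, the field name `topic` merely stands in for the MARC topic fields, and the helper `narrow` and the sample records are hypothetical.

```python
def narrow(records, filters):
    """Keep only records that satisfy every (field, value) filter (logical AND)."""
    return [
        record
        for record in records
        if all(value in record.get(field, []) for field, value in filters)
    ]


# Invented sample records; "topic" stands in for the MARC topic fields.
records = [
    {"id": 1, "topic": ["history", "second world war"]},
    {"id": 2, "topic": ["history", "kings and rulers"]},
    {"id": 3, "topic": ["kings and rulers"]},
]

step1 = narrow(records, [("topic", "history")])
step2 = narrow(records, [("topic", "history"), ("topic", "second world war")])

print([r["id"] for r in step1])  # records 1 and 2 carry "history"
print([r["id"] for r in step2])  # only record 1 carries both topics
```

Because every added filter can only remove records, each click on a facet produces a subset of the previous results list.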
If we choose to search for the topic “history,” the results list will display the records in which a topic field, represented by 6##, includes the word “history.” Choosing another subject like “second world war” will narrow the search, and all the resulting records that contain “history” but not “second world war” will be excluded from the current results list.

A visualized discovery box that is intuitive, easy to interact with, and smoothly integrated with web interfaces is an essential addition to these evolving software systems. However, the commercial companies and academic libraries that currently develop tools for the customization of the discovery layer do not offer this capability. Thus, an additional aim of this research was to extend the complex, modern catalogue software package with visualization capabilities while retaining its modularity and performance quality. Any visualization solution must integrate smoothly with the existing software architecture and be capable of using the retrieved information as any other facet does.

Our tool code was integrated into the VuFind open source system so that the VisFacet box is added to the user interface of the catalogue below the right facets section of the SERP (Fig. 1). The location can be changed to above the facets, should the system administrator wish to configure it that way. Section 3.1 describes the goals we addressed in the graphical design of VisFacet. In Section 3.2 we describe VisFacet’s modules and how they were integrated into VuFind.

Fig. 2. Enlarged view of VisFacet. The suggested topic “History” for the searched term “King” appears in the middle. The “History” topic occurs in the metadata topic field 4915 times.

====3.1 Visual Design====
In the example presented in Fig. 2, the topic node that appears in the center of the star-like graph is “History,” a topic suggested by the system because it appears in the largest number of items that came up in a search for the keyword “king.” The rest of the topic nodes (in green), each representing a cluster of results, are connected to it in descending order in a spiral form according to their grading. Era (in orange) and Region (in blue) are two additional facets of our visualization. The color of each facet appears in the legend at the bottom. All the colors (including the background) and sizes are configurable and adjustable. To save space, only the first word in a term appears near the nodes in the graph. When a user selects a node (by brushing the mouse over it), the full topic of the current node appears below the graph.

Any click on one of the nodes in the graph will narrow the current search results list according to the specific Topic, Era, or Region that was clicked, assuming the toggle at the top is in the Narrow position. For example, clicking on the “France” Region facet will drop all records that do not contain France as their region. The Narrow toggle that appears at the top of the box is on by default, while the Discover toggle is dimmed. Clicking on any of the nodes when Discover is on will trigger a completely new query with the clicked topic as a keyword. Since all of those topics were retrieved from the search engine index, this “Discover” function will never lead to an empty results set. The graph is redrawn with each search operation, whether text or graphic. Readers of this paper are welcome to try it at http://hufind.huji.ac.il/.

The visualization of the suggested topics and the facets should help the end user discover a subtopic to focus on from a wide-area topic. To address this requirement we have applied several dimensions to the two-dimensional layout of the infographic box as follows:

1. Link-Node form: A star-like graph where the topics are connected with an explicit link to the center that intuitively highlights the relation between the topics.

2. Spiral form and order: Another dimension of connectivity arises from the fact that we have arranged all the topics in a spiral form, according to their frequency. For example, the topic History, which occurs 4915 times in the resulting items, appears in the middle of Fig. 2. It is closely connected with the kings and rulers node, which occurs 2905 times in the resulting items. The other nodes appear in the spiral according to the number of occurrences that can be seen in the “Suggested Topics” box at the top of Fig. 1. The benefit of using a spiral form is that it spreads the topics evenly across any stretched box to optimize space usage as well as implicitly isolate the terms. Moreover, users gain a sense of the importance of the topics through their closeness to the center and thus may navigate between them using the spiral route. To draw the spiral, we redesigned Vogel’s formulation [15] of Fermat’s Spiral [16] to fit our needs. The graphical objects are drawn within a canvas with predefined width and height. For each object in the list, the algorithm calculates its polar coordinates, which are composed of the angle and radius relative to the center of the graph.

3. Colors are used to distinguish among the categories, such as Era and Region.

Fig. 3. VuFind with VisFacet integrated.

====3.2 System Design and Implementation====
Incorporating a visualization tool into the VuFind system, which is a generic engine for building and customizing library catalogues, is a natural evolution. VuFind is a complex software system with a large code base. Each part of the system is implemented in the programming language that is best suited for its particular feature. Hence, the architecture and programming languages of VisFacet were selected and designed in terms of code coherency and performance.
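
The spiral layout described in Section 3.1 can be approximated with Vogel’s original model, in which node n is placed at an angle of n times the golden angle and at a radius proportional to the square root of n. The paper states only that this formulation was “redesigned,” so the Python sketch below is an assumption-laden illustration rather than the VisFacet algorithm: the function name `spiral_layout`, the `margin` parameter, and the per-axis scaling used to fill a non-square box are all choices made here.

```python
import math

# Golden angle in radians (about 137.5 degrees), as used in Vogel's model.
GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))


def spiral_layout(count, width, height, margin=20.0):
    """Return (x, y) canvas positions for `count` nodes on a Vogel-style spiral.

    Node n gets the polar angle n * GOLDEN_ANGLE and a normalised radius
    sqrt(n / (count - 1)), so node 0 is at the centre and the last node is
    on the outer edge. The radius is scaled separately per axis so the
    spiral fills a stretched (non-square) box; that scaling is an
    assumption of this sketch.
    """
    cx, cy = width / 2.0, height / 2.0
    rx, ry = cx - margin, cy - margin
    positions = []
    for n in range(count):
        rho = math.sqrt(n / (count - 1)) if count > 1 else 0.0
        theta = n * GOLDEN_ANGLE
        positions.append((cx + rx * rho * math.cos(theta),
                          cy + ry * rho * math.sin(theta)))
    return positions


if __name__ == "__main__":
    # Topics sorted by descending frequency: the most frequent lands in the centre.
    topics = ["History", "Kings and rulers", "Topic 3", "Topic 4", "Topic 5"]
    for name, (x, y) in zip(topics, spiral_layout(len(topics), 300, 200)):
        print(f"{name}: ({x:.1f}, {y:.1f})")
```

Passing the topics in descending order of frequency reproduces the property described in the text: the most frequent topic sits in the center and less frequent topics lie progressively farther out along the spiral.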
{| class="wikitable"
|+ Table 1. A questionnaire that was presented to both groups: librarians and students. Each question presented in this table is followed by the possible answers and the rate for each option.
|-
! Question !! Answer options !! Rates
|-
| Q1 Text Search: Does the Region facet help to focus on a location? || 1. very helpful; 2. gave ideas; 3. little help; 4. no help || 3, 2, 1, 0 respectively
|-
| Q2 Graphic Search: Do the Suggested Topics help to discover a topic? || 1. very helpful; 2. gave ideas; 3. little help; 4. no help || 3, 2, 1, 0 respectively
|-
| Q3 Graphic Search: Does the Region facet help to focus on a location? || 1. very helpful; 2. gave ideas; 3. little help; 4. no help || 3, 2, 1, 0 respectively
|-
| Q4 How many steps did it take to focus on a topic in each search type? || ||
|-
| Q5 Which search type better highlights the relationship between the Suggested Topics in terms of their number of occurrences? || 1. Graphic is better than text; 2. Text search is enough; 3. The graphic search is good only when combined with Text search ||
|-
| Q6 Which discovery is best? || 1. Text; 2. Graphic; 3. Combined search ||
|}

Fig. 4. Answers for questions 1-3 compared between the two groups. Each column presents the average rate of the answers for Q1 (TextRegionScore), Q2 (GraphicSuggTopScore) and Q3 (GraphicRegionScore). The flowers in the middle of each column show the Q1, Q2, and Q3 standard deviations: 0.6, 0.94, and 0.91 for the students, and 0.53, 0.44, and 1.05 for the librarians.

The VuFind architecture, shown on the left side of Fig. 3, consists of an application core and two main layers. The data layer contains a search engine index that can be distributed among several machines, or even different libraries, and be updated daily.
The application core, which runs on the server side, is responsible for bringing data from the data layer; it performs all the required processing and then passes the data to the user interface layer. Finally, the user interface layer is responsible for arranging the data.

The VisFacet visualization subsystem is divided into three main components, each incorporated into the appropriate layer of the VuFind system, as can be seen in Fig. 3. The retrieval module, written in PHP, retrieves the data from the search engine index. It obtains information about a user’s current search, performs its own internal processing, and then returns all the information needed to render suggestions. This information is processed and organized for the visualization in the Glue UI module, which runs in the browser on the client side and was written in JavaScript. The Glue UI module binds its own functions to the visualization module and “listens” to interactions. The visualization module gets the organized data and translates it into the visualization objects, which are presented graphically in a web browser. We incorporated the Processing.js package (http://www.processingjs.org), which is used to create images, animations, and online interactions by using the visual programming language named Processing. It also converts the Processing code into JavaScript, thus allowing it to be run by any HTML5-compatible browser, including mobile browsers and current versions of Firefox, Safari, Chrome, Opera, and Internet Explorer. We implemented the visualization module, including the spiral algorithm described in Section 3.1, in the Processing language. Behind the scenes, the Processing.js library compiles our code to pure JavaScript.

===4 System Setup, User Study, and Evaluation Results===
The VisFacet package was set up, integrated, and tested in two environments: HUfind, which is a customized project based on VuFind, and VuFind itself.
After it passed the HUJI library’s approval regarding usage patterns and performance tests, it was added to the production version of HUfind in May 2014. The site at http://hufind.huji.ac.il/ allows thousands of HUfind users to use a graphic search. Note that it is enabled only when using browsers other than Internet Explorer.

To learn more about user satisfaction, we wrote a first-stage evaluation questionnaire, which is described in this section. We designed the questionnaire to measure the experiences of users wishing to discover a new research topic to focus on. The questionnaire was sent to two groups of users: (1) a group of 21 undergraduate students, and (2) a group of 10 experienced librarians who are familiar with the HUfind catalogue. The questionnaire itself consists of two search tasks: the Text search, which asks the user to search the suggested topics for the keyword “journalists” via the regular text interface, and the Graphic search, which requires the use of our infographic box. Each task requires the exploration of the Suggested Topics and the Region facet in order to focus on a direction. The search experience in each path is assessed with the questions that appear in Table 1. The user is asked to compare both experiences and determine whether she prefers one, the other, or a combination of the two.

The columns in Fig. 4 present the average rate of the answers to Q1, Q2, and Q3, according to the group involved: librarians or students. On average, both groups preferred the Text search over the Graphic search. The answers for the graphic evaluation were mainly “gave ideas” or “little help.” The librarians, who are used to text searches, were more skeptical about the graphic visualization, while the students liked it more than the librarians did. Some of them preferred the graphic visualization, and therefore the standard deviation for their answers was higher.
However, 16 students (76%) answered, in Q6, that they prefer having both types of search. This shows that they are open to this direction but want an improved display. We received the following two comments about VisFacet: (1) the search is less user friendly because it requires brushing long words with the mouse to make them fully visible, and (2) the importance of the order of nodes in the spiral form was not intuitively understood. We will address both problems in the next version.

The number of steps taken by the student group was very similar in both types of search: 1.91 steps on average, with a standard deviation of 0.80-0.84. The number of steps taken by the librarians was 1.8 on average, which is slightly smaller than in the other group, and it was the same for both types of search. Since the questionnaire is not a real discovery task, it is difficult to draw conclusions from this result. To gain greater benefit from VisFacet, we plan to provide additional guidance for its use as well as monitor its use and find out how we can improve it.

===5 Conclusion===
We have implemented a subsystem that adds an interactive visualized and faceted search to a modern catalogue. Because our subsystem is a new feature of VuFind, we have explored the existing system to understand its current state and find the appropriate solutions for our design. VisFacet is an initial suggestion for faceted visualization that can be a real impetus for extending the visualization capabilities of modern library catalogues.

===Acknowledgments===
We thank Demian Katz, the founder and main developer of the VuFind system, for his ongoing help in enabling the smooth integration of VisFacet into VuFind. We also thank Edith Falk, the chief librarian of the Hebrew University, for all her support in this project, and Eli Hayun, the programmer at the Library Authority of HUJI, for helping with the integration with HUfind.
We thank Nurit Baltiansky, Mally Cohen, and Avi Allalouf for the discussions regarding the questionnaires.

===References===
[1] Emanuel, J.: Usability of the VuFind next-generation online catalog. Information Technology and Libraries 30 (2011)

[2] Ramdeen, S., Hemminger, B.M.: A tale of two interfaces: How facets affect the library catalog search. JASIST 63 (2012) 702–715

[3] Merčun, T., Žumer, M.: New generation of catalogues for the new generation of users: A comparison of six library catalogues. Program (2008)

[4] Tennant, R.: Lipstick on a pig. Library Journal 130 (2005)

[5] Garfield, E.: A tribute to S.R. Ranganathan, the father of Indian library science. Essays of an Information Scientist 7 (1984)

[6] Merčun, T., Žumer, M.: Visualizing for explorations and discovery. In: Libraries in the Digital Age. (2010) 104–115

[7] Kaizer, J., Hodge, A.: AquaBrowser Library: Search, discover, refine. Library Hi Tech News (2005)

[8] Shneiderman, B., Plaisant, C., Cohen, M., Jacobs, S.: Designing the User Interface: Strategies for Effective Human-Computer Interaction (5th Edition). Pearson Education (2014)

[9] Tergan, S.O., Keller, T.: Knowledge and Information Visualization, Searching for Synergies. (2005)

[10] Thudt, A., Hinrichs, U., Carpendale, S.: The Bohemian Bookshelf: Supporting serendipitous book discoveries through information visualization. In: Proceedings of the SIGCHI, ACM (2012)

[11] Clarkson, E., Desai, K., Foley, J.: ResultMaps: Visualization for search interfaces. IEEE Trans. on Visualization and Computer Graphics (2009)

[12] Smith, G., Czerwinski, M., Meyers, B., Robbins, D., Robertson, G., Tan, D.S.: FacetMap: A scalable search and browse visualization. IEEE Transactions on Visualization and Computer Graphics 15 (2006)

[13] Wei, F., Liu, S., Song, Y., Pan, S., Zhou, M.X., Qian, W., Shi, L., Tan, L., Zhang, Q.: Tiara: A visual exploratory text analytic system. In: Proceedings of the 16th ACM SIGKDD Conference, ACM (2010)

[14] Thai, V., Rouille, P.Y., Handschuh, S.: Visual abstraction and ordering in faceted browsing of text collections. ACM Trans. Intell. Syst. Techn. (2012)

[15] Vogel, H.: A better way to construct the sunflower head. Mathematical Biosciences 44 (1979) 179–189

[16] Fermat’s spiral. (http://en.wikipedia.org/wiki/Fermat's_spiral)