Individual Differences in Exploration and Content Curation Activities within a Cultural Heritage Digital Library

Paula Goodale
University of Sheffield
Sheffield, United Kingdom
p.goodale@sheffield.ac.uk

In Proceedings of 1st International Workshop on Accessing Cultural Heritage at Scale (ACHS’16), 22nd June 2016, Newark, NJ, USA. Copyright 2016 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT

This paper presents empirical results from an evaluation study of a cultural heritage digital library. It focuses on the differences in preferences between novice and expert users for functionality supporting browsing and exploration, when engaged in orientation and content curation tasks. Findings indicate both similarities and differences between novice and expert users. Recommendations for future work are proposed.

Keywords

Novice, expert, information access, information seeking, exploration, content curation, cultural heritage, digital library.

1. INTRODUCTION

As digital cultural heritage collections become larger and more widely available, they are targeted at more diverse user communities with varying levels of subject and domain knowledge. No longer the preserve of scholarly researchers, they also seek to engage users with general as well as specialist knowledge, for leisure and education purposes. Users are therefore likely to span a continuum from novice to expert, with varying interests in the library content, varying degrees of subject and domain knowledge, and different types of task that are likely to be undertaken.

Novice users (low subject and domain knowledge) frequently experience difficulties in finding content via web search and in digital libraries of all kinds, particularly when the task is less focused and more exploratory in nature. Their lack of subject and domain knowledge inhibits the successful use of the search box, as keyword formulation and reformulation often prove difficult. In contrast, expert users, with higher levels of subject and domain knowledge, are more confident in search, as they have a repertoire of topics and associated keywords to draw upon.

It might therefore be expected that novice users will have a preference for tools which support browsing and exploration (discovery) of the digital library content, especially in more diverse and large-scale collections. As digital collections grow, individually and in aggregate forms, simple orientation (understanding ‘what’s here’, i.e. which topics are covered and in what depth) can be challenging, and might need to be addressed even before exploration of the content can begin. Additionally, discovery tools should support the needs of novice users in finding and selecting content for topic-focused tasks. This need is likely to be especially acute when an element of creativity and synthesis is involved, such as content curation.

Developers of information seeking support systems that intend to support users in more exploratory and creative tasks, including in cultural heritage digital libraries, should therefore seek to provide tools for orientation, finding (non-search), and curating content. This paper aims to examine these requirements via a laboratory-based evaluation study of an experimental system (PATHS¹) that offers these types of functionality for a large-scale aggregated cultural heritage digital library, based upon a UK sub-set of the Europeana² content. Specifically, the paper aims to investigate any potential differences in the preferences of novice and expert users for these types of tools when engaged in orientation, finding and content curation tasks.

¹ PATHS Project: http://www.paths-project.eu
² Europeana: http://www.europeana.eu/portal/

2. RELATED WORK

2.1 Information seeking tasks and systems in digital cultural heritage

Information seeking tasks in the cultural heritage domain are often more complex and/or exploratory in nature, including subject-based searches and less-focused activities, where there is a higher degree of uncertainty in what is being sought [1, 8]. Exploratory information seeking activities go beyond simple look-up or known-item search, incorporating elements of learning (acquiring, interpreting, comparing, etc.) and investigation (analysis, evaluation, synthesis, transformation, etc.) [6]. Information seeking support systems in the area of exploratory search therefore require a wider range of functionality to support these more complex activities [7, 15].

The wider range of user interactions in the cultural heritage domain incorporates content curation and support categories [10]. The second category, curation, goes beyond finding into various elements of information use, including the addition of annotations, creation of user exhibitions from available content, and storytelling [10]. These activities are more closely aligned with information use than with information finding (searching, browsing and exploration), and represent an opportunity for cultural heritage digital libraries to provide wider access to content and to support reuse and creativity.

Another important element of user requirements in digital cultural heritage is visual representation of collection items [9]. Support for serendipity can also prove to be beneficial and popular with users engaged in less-focused information seeking tasks [12].

2.2 Novice and expert user differences

Differences in the needs and behaviors of novice and expert information seekers have been researched in many domains. In web search, domain expertise results in different search strategies and more successful results in finding relevant content [13]. Domain knowledge also results in more focused, systematic search tactics within digital libraries [14]. However, whilst domain knowledge enhances search success, technical skills may offset this to some degree, thereby indicating that those lacking in both domain knowledge and web search expertise are doubly disadvantaged [4].

In the cultural heritage domain, more experienced users are likely to be scholars and researchers in humanities subject areas, as well as cultural heritage professionals, whilst less experienced users may be from educational and general interest categories [3, 11]. Expert users in cultural heritage undertake a wide variety of tasks including known-item search and more exploratory activities [1]. Moreover, novice users involved in leisure activities also undertake a variety of information seeking tasks, and are highly visually focused, as well as engaging in elements of meaning-making [9].

3. METHODS

The results presented in this paper are derived from a comprehensive evaluation study of a prototype information seeking support system, created during the PATHS project and designed to investigate functionality for the support of exploration and curation of content in large-scale cultural heritage digital libraries. The study was carried out under controlled conditions in a laboratory setting, utilizing a variety of simulated work tasks [2] as a means of gaining feedback on system usability and usefulness, to inform future system design, and to investigate user preferences, behaviors, and interactions in this relatively novel context.

Screenshots of the system are shown in Figures 1-3 below, illustrating thesaurus, map and path functionality, offered as different means of exploring the content in the collection and of curating content. The prototype PATHS system contained c.1 million items selected from UK institutions in the Europeana digital library.

3.1 Tasks

During the evaluation session users were invited to complete five short orientation and information seeking tasks lasting 5 minutes each, followed by one 30-minute content curation task. This paper focuses on the results of one of the orientation tasks and the content curation task.

The orientation task required users to investigate the topics available in the collection, using any of three tools designed to support browsing and exploration (thesaurus, tag cloud and map). Feedback was then supplied on the ease of use and usefulness of each tool using 5-point semantic differential scales, and the user’s rank order of preference for the three tools (1st, 2nd, 3rd).

The content curation task entailed finding and selecting content (items held within the digital library) on a topic of the user’s own choice, then organizing and annotating these items to form a meaningful route (path) through the collection. This task therefore required the user to employ tactics to find content via the search box and/or the exploration tools used in the earlier orientation task, as well as the more creative element of the activity. The whole task can be considered as exploratory [5] as it is relatively non-prescriptive and open-ended, and incorporates elements of discovery and synthesis [6].

3.2 Sample

Sample size was 34 participants, comprising 24 novice users and 10 expert users. Novice users were categorized as those with a more general knowledge of cultural heritage (low subject/domain knowledge), and expert users as those with a higher degree of subject knowledge gained from accessing cultural heritage collections for work-related use. A majority of users (n=32) self-reported either an intermediate or high level of experience in using web search, which, it has been suggested, may offset a lack of subject and domain knowledge to some degree [4].

Figure 1: PATHS Screenshot – thesaurus exploration

Figure 2: PATHS Screenshot – map exploration

Figure 3: PATHS Screenshot – path creation interface

4. RESULTS

Data from user feedback on the two tasks was analysed for user differences according to the novice and expert categorization.

4.1 Orientation

Both novice (66.7%) and expert (70%) user types were emphatic in their placement of the thesaurus as the most useful for aiding orientation, i.e. finding out ‘what’s here’ (Table 1). There was more of a split for the tag cloud and the map, with the largest share of novice users placing each of these in 3rd place (45.8% and 54.2% respectively), whilst expert users placed them more emphatically in 2nd (80%) and 3rd (60%) places respectively. A majority of both user types placed the relatively novel ‘map’ tool in third place, although more of each type also placed it in first position than they did the tag cloud. This difference may be accounted for by the relative novelty of the map, but other factors may also be at play, such as a preference for image vs text visualizations.

User type  Rank  Thesaurus  Tag cloud  Map
Novice     1st   66.7%      12.5%      20.8%
Novice     2nd   33.3%      41.7%      25.0%
Novice     3rd    0.0%      45.8%      54.2%
Expert     1st   70.0%       0.0%      30.0%
Expert     2nd   10.0%      80.0%      10.0%
Expert     3rd   20.0%      20.0%      60.0%

Table 1: Preference for exploration tools, novice/expert users

Similarly, 79% of novice users and 80% of expert users rated the thesaurus as either very useful or useful, and 75% and 90% respectively rated it as very easy or easy to use, on 5-point semantic differential scales. However, a difference of opinion was found on the tag cloud, with fewer novice users rating it as useful (33%) and easy to use (50%) than expert users (80% for each). In contrast, novices were somewhat more favorable towards the map tool (46% rating it useful and easy to use) than expert users (40% useful and easy to use).

It seems therefore that the thesaurus is the overall winner for both user types, but that novice users found the map more useful than the tag cloud, and vice versa for expert users.

4.2 Finding content

Feedback on the usefulness of tools in finding content of interest for the content curation task was given on a wider range of functionality, including the search box, the thesaurus, tag cloud and map tools, browsing of search results and filtering using facets, recommendations in the form of selected (featured) and related items, metadata, and links to background information in Wikipedia. Again a 5-point differential scale was used (very useful to useless), with an additional category for ‘did not use’.

As might be expected, all users used the search box, although expert users were more emphatic in rating it very useful (80%) than novice users (66.7%). As in the orientation task, the thesaurus was deemed the most useful exploration tool, with 46% of novices finding it very useful or useful, compared with 20% of expert users.

Expert users were more likely than novice users to rate metadata-driven tools as useful, including facets (40%) and metadata keyword links (80%), compared with 25% and 42% respectively for novices. Interestingly, experts were also more likely than novices to find useful the recommendations in the form of related and selected items, and the browsing of search results pages. This unexpected finding for search results pages may arise from more successful searches by expert users, or simply from their having a better idea of what they were looking for, so that they would ‘know it when I see it’.

Overall then, it seems that novices rate the thesaurus most highly of all the exploratory tools offered, and that experts are more likely to find a wider range of tools useful, including those such as facets and subject metadata that might require more specialist knowledge to interpret.

4.3 Curating content

The first stage of curating content is to select items for inclusion. Whilst directly related to finding content, there is a more active level of intellectual effort, with choices being made amongst available content, and potentially disregarding some items in favor of others. Users gave feedback on both the information used to make these decisions and the criteria by which items were selected.

As expected, both novice and expert users favored images as a primary element of their decision-making process (Table 2). This is unsurprising since it is widely accepted that using cultural heritage collections is a highly visual process, and the curatorial task may be even more visual in nature. It is also clear that novices used much less ‘other’ non-visual information than expert users in making their selections. This difference is most marked in relation to metadata, used by 60% of expert users, but only 12.5% of novice users.

Category          Item            Novice  Expert
Information used  image           95.8%   100.0%
Information used  title           66.7%    80.0%
Information used  description     50.0%    70.0%
Information used  metadata        12.5%    60.0%
Criteria used     typical         75.0%    40.0%
Criteria used     unusual/unique   4.2%    10.0%
Criteria used     aesthetics      62.5%    60.0%
Criteria used     interesting     29.2%    30.0%
Criteria used     available       33.3%    30.0%

Table 2: Information and criteria used for selecting content, novice/expert users

Criteria used for inclusion of specific items had commonalities and differences (Table 2). Novices and experts were relatively similar in their choice of aesthetically pleasing items (62.5% and 60% respectively), reinforcing the finding on the importance of images. Both user types were similar in their selection based upon interesting descriptions and choosing the only items available on their chosen topic. However, novices (75%) were much more likely to choose typical examples than expert users (40%).

At the next stage of content curation, the items must be arranged in some order and might also be augmented with annotations to add context and aid understanding by the eventual user. There is a striking difference between novice and expert users in ordering their content. Expert users arranged content by theme (40%) and narrative (50%). A majority of novice users also preferred a thematic arrangement (54%), but smaller proportions used criteria such as chronology, geography, narrative, importance, and no particular order. This may indicate that experts have a more specific idea about the nature of curation, incorporating themes and narratives, but it is also clear that less-experienced users are also drawn towards thematic arrangements.

Finally, novice users were less critical of the curated content they produced during this task. Rating the quality of their output on a scale of 1-10, 21% of novices selected a score of 6 or above, compared to none of the expert users. In contrast, 60% of experts rated their output in the range 1-3, compared with 50% of novices. Additionally, the highest rating given by expert users was 5 out of 10, compared to 9 out of 10 for novice users. It seems that expert users had a clearer idea of what their curated content should look like, both in terms of arrangement and quality of content. In free text feedback, many users commented that they would like better quality images and time to add more contextual annotations to their curated content.

5. DISCUSSION

During this study, we have investigated the differences between novice and expert users in their preferences and choices for tools to support more exploratory information seeking and information use in the form of content curation, within the context of a large-scale aggregated cultural heritage digital library. Whilst search was still the primary choice for all users, novices were more likely to use exploratory tools to augment their orientation and finding activities. Specifically, novices were found to be more pre-disposed than expert users to using a thesaurus tool for exploration of the content, and were also more open to using other exploratory tools. In contrast, experts were more likely to make use of more specialist tools based upon collection metadata, such as facets and subject keywords.

Given the challenges experienced by novices from lower levels of subject and domain knowledge, it is likely that these results may be at least partially explained by the support provided by the exploration tools in overcoming this lower level of knowledge. The thesaurus in particular lays open the main topics within the collection, and is easy to navigate, comprising hierarchical categories and sub-categories. A further bonus may be that the thesaurus was derived from Wikipedia subject headings [ref anon], giving a more informal level of access to subject-related content.

However, differences by novice and expert categorization may not be the only factors affecting accessibility of cultural heritage content. Previous analyses of this evaluation study have also identified differences in behavior and preferences according to cognitive style [ref anon], selected demographics [ref anon] and variations in the system functionality from simple to more complex [ref anon]. It is therefore even more pertinent to consider designing for a diverse range of users to ensure the greatest potential for increasing access, although perhaps focusing on those tools that aid the widest range of users, in this case the thesaurus, which was well-received by novices and experts alike.

6. CONCLUSIONS

User differences can impact upon successful access to content within large-scale cultural heritage digital libraries. Of all these factors, though, novice/expert differences are the most likely to affect overall success in finding and exploring content. Novice and expert users express somewhat different preferences for tools to support exploration of digital cultural heritage collections. They also make some different and some similar choices when engaged in finding and creating material for content curation activities. As information seeking support systems for collections are increasingly targeted at a more diverse range of users, from novice to expert in their range of subject and domain knowledge, it is necessary to understand and accommodate these user requirements and differences through functionality that supports a range of preferences and abilities.

In future work we will also undertake more detailed analysis of actual user behavior from screen recordings and transaction logs. This will provide a useful contrast between what users report as preferences and choices and what functionality they use in practice, as well as uncovering sequences and patterns of behavior, providing a basis for recommendations for system design for the support of exploration in cultural heritage collections. Further, more naturalistic studies of users interacting with systems that are in the public domain, undertaking their own tasks under less controlled conditions, will also be of interest, to provide insights into the levels of take-up and actual usage of these types of information seeking support tools in cultural heritage collections ‘in the wild’.

7. ACKNOWLEDGMENTS

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 270082. The author acknowledges the contribution of all project partners involved in PATHS (see: http://www.paths-project.eu).

8. REFERENCES

[1] Amin, A. et al. 2008. Understanding Cultural Heritage Experts’ Information Seeking Needs. JCDL’08, June 16-20, 2008, Pittsburgh, Pennsylvania, USA, 39–47.

[2] Borlund, P. 2010. Reconsideration of the Simulated Work Task Situation: A Context Instrument for Evaluation of Information Retrieval Interaction. IIiX 2010, August 18-21, 2010, New Brunswick, NJ, USA, 155–164.

[3] Goodale, P. et al. 2011. D1.1 User Requirements Analysis. PATHS Project. http://www.paths-project.eu/eng/Resources.

[4] Hölscher, C. and Strube, G. 2000. Web search behavior of Internet experts and newbies. Computer Networks. 33, 1-6, 337–346.

[5] Kules, B. and Hill, C. 2009. Designing Exploratory Search Tasks for User Studies of Information Seeking Support Systems. JCDL’09, June 15-19, 2009, Austin, TX, USA.

[6] Marchionini, G. 2006. Exploratory Search: From finding to understanding. Communications of the ACM. 49, 4, 41–46.

[7] Shneiderman, B. 2000. Creating Creativity: User Interfaces for Supporting Innovation. ACM Transactions on Computer-Human Interaction. 7, 1, 114–138.

[8] Skov, M. 2009. The Reinvented Museum: Exploring Information Seeking Behaviour in a Digital Museum Context (thesis). Royal School of Library and Information Science, Denmark.

[9] Skov, M. and Ingwersen, P. 2008. Exploring Information Seeking Behaviour in a Digital Museum Context. IIiX’08, Information Interaction in Context 2008, London, UK, 110–115.

[10] Stiller, J. 2012. A Framework for Classifying Interactions in Cultural Heritage Information Systems. International Journal of Heritage in the Digital Era. 1, 1, 141–146.

[11] Sweetnam, M.S. et al. 2012. User Needs for Enhanced Engagement with Cultural Heritage Collections. TPDL 2012, Sept 23-27, 2012, Paphos, Cyprus, 64–75.

[12] Toms, E.G. and McCay-Peet, L. 2009. Chance Encounters in the Digital Library. ECDL 2009, Sept 27-Oct 02, 2009, Corfu, Greece, 192–202.

[13] White, R.W. et al. 2009. Characterizing the Influence of Domain Expertise on Web Search Behavior. WSDM’09, February 9-12, 2009, Barcelona, Spain.

[14] Wildemuth, B.M. 2004. The Effects of Domain Knowledge on Search Tactic Formulation. JASIST. 55, 3, 246–258.

[15] Wilson, M.L. et al. 2010. From Keyword Search to Exploration: Designing Future Search Interfaces for the Web. Foundations and Trends in Web Science. 2, 1, 1–97.