Just looking around: Supporting casual users' initial encounters with Digital Cultural Heritage

David Walsh
Department of Computing
Edge Hill University
St Helens Road
Ormskirk, L39 4QP
United Kingdom
walshd@edgehill.ac.uk

Mark M. Hall
Department of Computing
Edge Hill University
St Helens Road
Ormskirk, L39 4QP
United Kingdom
mark.hall@edgehill.ac.uk

ABSTRACT

Cultural Heritage institutions have developed numerous ways of supporting visitors who have simply wandered in through the front door. However, for their digital collections, the CH institutions mostly provide a simple search-box, which supports the expert user, but which does not support the casual user who has just stumbled across the collection. These casual users frequently have no goal or topic in mind, but just want to look around at what is available in the collection. For these users the blank search-box presents a significant obstacle, as without a goal or topic it is very difficult to formulate an appropriate query. In this paper we propose extending current exploratory search and information seeking models to support the initial interaction between the casual user and the collection.

Keywords

information seeking, information retrieval, casual user, exploration

Copyright © 2015 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
ECIR Supporting Complex Search Task Workshop '15, Vienna, Austria.
Published on CEUR-WS: http://ceur-ws.org/Vol-1338/.

1. INTRODUCTION

Cultural Heritage (CH) institutions such as museums and galleries are accustomed to dealing with visitors who arrive at their doors with no knowledge of what they could see or even what they would like to see. At the moment of arrival the only goal these visitors have is spending some time in the museum or gallery. CH institutions have developed a number of successful strategies for supporting these visitors by providing them with floor plans, fliers, guide-books, audio-guides, and guided tours.

The major limitation that CH institutions face is that the physical space available for displaying the institution's artefacts and the time required to curate the various introductory guides severely limit the number of artefacts that can be shown to visitors, with the majority of artefacts held in storage rather than on display. To address this limitation CH institutions have been engaged in a massive digitisation process, making large parts of their holdings available to the public on their websites.

Access to these collections is primarily provided through the search box. This works very well for expert users who know exactly what they want and which keywords they need to use to find what they are looking for [25]. However, CH institutions also have to support the casual user who has just stumbled across their collection in the same way that they would wander into the CH institution's physical space. For these casual users, who have no specific goal in mind when they come to the collection, the blank search-box presents an almost insurmountable problem [30]. To access anything in the collection, they are required to know at least one keyword, and they are not provided with any alternative way to find what might be available and develop an overview of the collection [15, 23, 14]. This is illustrated by the following quote from a casual user of such a collection:

    So what use are the digital libraries, if all they do is put digitally unusable information on the web. [5]

There is thus a clear need to support these casual users in their use of the collection. However, in the information seeking (IS) and information retrieval (IR) domains, the focus has always been on users with a more or less clearly defined goal, as the lack of a goal breaks the fundamental assumptions in all current IS and IR models [12, 31]. Much work has been done on supporting exploration and discovery when the user has at least a very vague goal [13, 17, 7], but there remains a gap in our understanding around that first, casual interaction between the user and the collection.

2. BACKGROUND

Opening the CH institutions' archives to the world through digitisation makes them available to a much wider audience than just the museum curators and CH researchers [15]. With this expanded audience comes the requirement of supporting users outside the core group of experts who have previously explored the CH institutions' archives, to include the casual user [9, 27]. This presents a problem to current information retrieval models and techniques, as these are generally built around the concept that the user has some kind of goal, however vague, in mind.

The reasons users turn to digital information systems cover the whole spectrum from (re-)locating a piece of information they know exists to exploring an unknown topic to develop an understanding [6]. These interactions with digital information systems can roughly be classified as known-item or known-topic searches, where the user knows what they are looking for and what they expect to see, and exploratory search interactions, where the goal is to explore, learn, interpret, synthesize, and understand [18, 28, 30].

The traditional IR model describes a simple loop consisting of problem identification, query formulation, and result evaluation [25], which successfully supports the known-item and known-topic search tasks [29]. To support the more open-ended exploration interactions, this basic model has been expanded to create exploratory search models [18, 20], which have much wider scope, complexity, and duration [19, 3, 28].

These models all treat the search process as if it is completed in a single session. However, the process of satisfying an information need will often extend over multiple search sessions as the users slowly develop and refine their precise understanding of what they are looking for. A number of models of this extended process have been created to describe this information seeking journey [16, 26]. These models all describe a process in which the user starts with a very vague notion of what they are looking for and what the journey's end-point will be. Then, as the user interacts with the search system, they develop a clearer understanding of their information need and their searches become ever more focused until they develop the final queries that satisfy their information need.

The final phases of the information seeking journey are generally well supported by the traditional search model and interfaces. For the earlier, more open-ended stages, a number of exploratory interfaces have been developed. Hierarchical systems [10] were intended to help organize large sets of documents into groups or categories [8], enabling searchers to perform more sophisticated browsing tactics such as traversing and exploring nearest neighbour categories [2, 30]. Clustering approaches [7] group together related documents to give the user an overview of the “topics” in their search results. Faceted classification [24, 18, 13, 21] generates a list of the most frequent keywords for the collection (or search result) and shows these to the user. The user can then explore the collection by clicking on the keywords, rather than having to type them into the search box. Tag-clouds provide a similar visualization of the most frequent keywords. Socially curated systems [22, 11, 1] allow users to curate their own mini-exhibitions and then share these with other users, providing the new and casual user with a starting point for exploring the collection.

These approaches all suffer from a number of technical limitations, primarily around the difficulty of scaling to the massive amount of information that is available in modern Digital Cultural Heritage (DCH) collections. The manual processes that create hierarchical systems cannot deal with the millions of items that exist in modern big-data DCH collections and that need to be classified. Socially curated systems suffer from the same lack-of-manpower issue. Additionally, to provide a comprehensive overview of a collection, they would require so many mini-exhibitions that they simply replace the problem of finding an item that the user is interested in with the problem of finding a mini-exhibition that the user is interested in. Clustering and visualization approaches can deal with the amount of data, but the resulting visualizations tend to suffer from information overload and do not provide a usable overview of the large collections. Similarly, faceted interfaces can process the amount of data available, but DCH collections are very heterogeneous [15], and showing the most frequent 20 or 30 keywords gives the user an overview of, and access to, no more than a very small fraction of the total content.

More important, however, is that all these theories and interfaces start with the assumption that the user has at least a very vague goal in mind. They do not model or support the completely undirected casual user.

3. SUPPORTING THE CASUAL USER

To support the casual user in their initial interaction with the collection, the major change we propose is to let go of the concept of the “information need” as the reason for interacting with an information system. The casual user has a motivation for coming to the CH institution's site, but this is not necessarily a need for information; they might just want to procrastinate. The focus for supporting the casual user has to shift from supporting them in exploring and finding what they are looking for to supporting them in understanding what is available in the collection and where they might start browsing.

We envision a number of different interfaces that could enable such access. One approach would be to generate textual summaries that describe the type of content available from the collection. Such an approach would need to analyse the individual items' meta-data using an algorithm such as LDA [4], then combine that with a textual resource such as Wikipedia to generalise the topics, and finally generate textual descriptions such as “The collection contains historical artefacts from ancient Egypt, space exploration, horology, and a modern collection of oceanographic specimens.” The user could then click on any of the topics to get a summary of the content in the selected area of the collection, enabling them to freely explore.

An alternative would be to use the topic structure to generate an exploratory semantic map that users can interact with and explore like they would a physical map. Another approach could be to look at developing an automatic measure for the “interestingness” of items in the collection. This could then be used to sample items from the collection to show the casual user the “highlights” of the collection.

The investigation of potential interfaces will have to be accompanied by a series of user studies that investigate how casual users develop a topic they are interested in when confronted with a new collection. This will enable us to extend the existing models for exploratory search and information seeking by providing a more detailed understanding of the initial phase in which the user develops their information need. This extension will enable information systems to support the complete information journey, from the development of the information need to its final fulfilment.

Finally, while in DCH this issue is particularly prevalent, understanding the casual user who has no immediate need could also have significant impact in the area of E-commerce. E-commerce is a major growth area, but currently does not support browsing the available items in the same way that you can browse through a shop.

4. REFERENCES

[1] E. Agirre, N. Aletras, P. Clough, S. Fernando, P. Goodale, M. Hall, A. Soroa, and M. Stevenson. Paths: A system for accessing cultural heritage collections. In ACL (Conference System Demonstrations), pages 151–156. Citeseer, 2013.
[2] M. J. Bates. Information search tactics. Journal of the American Society for Information Science, 30(4):205–214, 1979.
[3] M. J. Bates. The design of browsing and berrypicking techniques for the online search interface. Online Information Review, 13(5):407–424, 1989.
[4] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
[5] C. L. Borgman. The digital future is now: A call to action for the humanities. Digital Humanities Quarterly, 3(4), 2009.
[6] K. Byström and K. Järvelin. Task complexity affects information seeking and use. Information Processing & Management, 31(2):191–213, 1995.
[7] C. Carpineto, S. Osiński, G. Romano, and D. Weiss. A survey of web clustering engines. ACM Computing Surveys (CSUR), 41(3):17, 2009.
[8] M. Chen, M. Hearst, J. Hong, and J. Lin. Cha-cha: A system for organising intranet search results. 1999.
[9] A. S. Cifter and H. Dong. User characteristics: Professional vs. lay users, 2009.
[10] E. Clarkson, K. Desai, and J. D. Foley. Resultmaps: Visualization for search interfaces. IEEE Transactions on Visualization and Computer Graphics, 15(6):1057–1064, 2009.
[11] M. Hall, P. Goodale, P. Clough, and M. Stevenson. The paths system for exploring digital cultural heritage. In Clare Mills, Michael Pidd and Esther Ward. Proceedings of the Digital Humanities Congress, 2012.
[12] M. Harvey, M. Wilson, and K. Church. Workshop on searching for fun 2014. In Proceedings of the 5th Information Interaction in Context Symposium, IIiX '14, pages 6–6, New York, NY, USA, 2014. ACM.
[13] M. A. Hearst. Clustering versus faceted categories for information exploration. Communications of the ACM, 49(4):59–61, 2006.
[14] K. Hornbæk and M. Hertzum. The notion of overview in information visualization. International Journal of Human-Computer Studies, 69(7-8):509–525, 2011.
[15] A. Johnson. Users, use and context: supporting interaction between users and digital archives. What Are Archives?: Cultural and Theoretical Perspectives: A Reader, pages 145–64, 2008.
[16] C. C. Kuhlthau. Inside the search process: Information seeking from the user's perspective. JASIS, 42(5):361–371, 1991.
[17] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to information retrieval, volume 1. Cambridge University Press, Cambridge, 2008.
[18] G. Marchionini. Exploratory search: from finding to understanding. Communications of the ACM, 49(4):41–46, 2006.
[19] D. A. Norman. The psychology of everyday things. Basic Books, 1988.
[20] P. Pirolli. Powers of 10: Modeling complex information-seeking systems at multiple scales. Computer, 42(3):33–40, 2009.
[21] P. L. Schmitz and M. T. Black. The delphi toolkit: Enabling semantic search for museum collections. In Museums and the Web 2008: the international conference for culture and heritage on-line, 2008.
[22] F. M. Shipman, R. Furuta, D. Brenner, C.-C. Chung, and H.-w. Hsieh. Guided paths through web-based collections: Design, experiences, and adaptations. Journal of the American Society for Information Science, 51(3):260–272, 2000.
[23] M. Skov and P. Ingwersen. Exploring information seeking behaviour in a digital museum context. In Proceedings of the second international symposium on Information interaction in context, pages 110–115. ACM, 2008.
[24] D. A. Smith, A. Owens, A. Russell, C. Harris, M. Wilson, et al. The evolving mspace platform: leveraging the semantic web on the trail of the memex. In Proceedings of the sixteenth ACM conference on Hypertext and hypermedia, pages 174–183. ACM, 2005.
[25] A. Sutcliffe and M. Ennis. Towards a cognitive theory of information retrieval. Interacting with Computers, 10(3):321–351, 1998.
[26] P. Vakkari, M. Pennanen, and S. Serola. Changes of search terms and tactics while writing a research proposal: A longitudinal case study. Information Processing & Management, 39(3):445–463, 2003.
[27] P. Vilar and A. Šauperl. Archival literacy: Different users, different information needs, behaviour and skills. In Information Literacy. Lifelong Learning and Digital Citizenship in the 21st Century, pages 149–159. Springer, 2014.
[28] R. W. White and S. M. Drucker. Investigating behavioral variability in web search. In Proceedings of the 16th international conference on World Wide Web, pages 21–30. ACM, 2007.
[29] R. W. White and R. A. Roth. Exploratory search: Beyond the query-response paradigm. Synthesis Lectures on Information Concepts, Retrieval, and Services, 1(1):1–98, 2009.
[30] M. L. Wilson and D. Elsweiler. Casual-leisure searching: the exploratory search scenarios that break our current models. In Proceedings of HCIR, pages 28–31, 2010.
[31] C. Ye and M. Wilson. The characteristics of casual sessions in search behaviour logs, 2014.
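
As an illustrative aside to the textual-summary interface proposed in Section 3, the last step of that pipeline (turning topics into a sentence a casual user can read and click through) might look roughly like the Python sketch below. It is a minimal sketch under stated assumptions, not a description of an existing system: the topic labels are assumed to have been produced already by an upstream step such as a topic model whose topics were generalised to readable names, and that step is not shown. The function names `summarise_collection` and `items_for_topic`, the sentence template, and the toy item identifiers are all hypothetical.

```python
from collections import Counter


def summarise_collection(item_topics, max_topics=4):
    """Build a one-sentence overview from per-item topic labels.

    item_topics maps an item identifier to a readable topic label.
    The labels are ASSUMED to come from an upstream topic-modelling and
    generalisation step, which this sketch does not implement.
    """
    counts = Counter(item_topics.values())
    # Keep only the most frequent topics so the sentence stays short.
    top = [topic for topic, _ in counts.most_common(max_topics)]
    if not top:
        return "The collection is empty."
    if len(top) == 1:
        listing = top[0]
    elif len(top) == 2:
        listing = f"{top[0]} and {top[1]}"
    else:
        listing = ", ".join(top[:-1]) + f", and {top[-1]}"
    return f"The collection contains artefacts relating to {listing}."


def items_for_topic(item_topics, topic):
    """Return the items a user would reach by clicking on one topic."""
    return sorted(item for item, t in item_topics.items() if t == topic)


# Hypothetical toy collection: identifiers and labels are made up.
collection = {
    "obj-001": "ancient Egypt",
    "obj-002": "ancient Egypt",
    "obj-003": "space exploration",
    "obj-004": "horology",
    "obj-005": "ancient Egypt",
    "obj-006": "space exploration",
}

print(summarise_collection(collection))
print(items_for_topic(collection, "horology"))
```

On the toy data the first call lists the three topics in order of frequency and the second returns the single horology item. Ranking topics by raw item count is only one possible choice; for a heterogeneous collection it would hide small but distinctive areas, which is the same weakness noted for frequency-based facets in Section 2.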