An Exploratory Search User Interface Concept Supporting Vague Querying and a Novel Result Representation

Timo Lüddecke, Markus Jüttner, Marcus Nitsche, Andreas Nürnberger
Data and Knowledge Engineering Group, Faculty of Computer Science, Otto von Guericke University Magdeburg
{timo.lueddecke, markus.juettner}@st.ovgu.de, {marcus.nitsche, andreas.nuernberger}@ovgu.de

Copyright © 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
In: U. Kruschwitz, F. Hopfgartner and C. Gurrin (eds.): Proceedings of the MindTheGap'14 Workshop, Berlin, Germany, 4-March-2014, published at http://ceur-ws.org

Abstract

Common search engines deliver quite good results when the user has a precise notion of what he is looking for. However, the user might have in mind additional prior information regarding the importance of specific terms. Consequently, it seems desirable to avoid the latter and incorporate the knowledge into the query instead. Therefore, we propose a search user interface concept that supports users in modelling their uncertainty in a comfortable way, fosters exploratory search and provides a compact yet informative representation of results. An implemented prototype demonstrates the feasibility of the concept. We also present results of a first two-step usability study. The results indicate good usability of the concept and show that even this novel concept meets users' expectations.

1 Motivation

Modern search engines have become very powerful tools, providing excellent results, even in areas beyond basic document queries like finding a nearby dentist or checking the weather for next weekend. However, they require textual input of keywords by the user, who is not necessarily capable of formulating suitable terms at the beginning of the search, for instance because he is new to the domain. Search engines also lack support for expressing the importance of selected search terms, as every term has potentially the same impact on the result (apart from context-sensitivity). Functions for explicitly excluding terms are either hidden, and in most cases unknown to users, or do not exist at all. A study on search query logs conducted by Jansen et al. [9] found that the boolean operators "NOT" or "-" were only used in 3.34% of all queries. However, we believe that term exclusion could turn out to be useful in a much higher number of cases. It appears that non-uniform term importance, and especially exclusion, is desirable in numerous scenarios, e.g. when searching for recipes with a favourite and another nice-to-have ingredient while being allergic to a third component. The weighting of a term does not necessarily encode its (known) relevance for the user. It might also specify the user's (un)certainty about the suitability of single terms.

Another aspect of search engines is the presentation of results. In most cases it comprises just text containing the title, a text snippet and the URL. This gives no clue about the visual appearance of the actual document, which nevertheless could be helpful for a user's relevance estimation and recognition of previously visited websites.

In this paper, we present a concept designed to overcome these disadvantages of current web search user interfaces by introducing a novel query formulation mechanism and a compact representation of a web page's content as well as its visual appearance. Both are integrated into a search user interface prototype that addresses some aspects of exploratory search [15] by providing support in expressing uncertainty. A brief overview of exploratory search tools and evaluation techniques is provided in [8].

2 Related Work

The VIBE system [12, 17] also supports users in interactively finding and filtering relevant information.
Here, magnets are used to attract relevant documents to specific screen points. Nitsche & Nürnberger [16] introduced QUEST, a user interface concept where terms are placed radially around a center with the distance to it encoding the uncertainty: The closer a term is, the higher its specific weight in the whole search query. Results are also represented by small dots or favicons in the radial layout, whereby the distance to the center maps the relevance to the current query constellation. As only the distance is taken into account, an arbitrary angle can be chosen without changing the semantics of a query formulation. Therefore, multiple arrangements of terms can encode the very same query, which might be a shortcoming of this approach. It also provides only a weak structure for the user's decision on which search result to survey first.

The problem of reducing a web page's content to a compact representation has been addressed in various publications [10, 19, 2]. In most cases, these representations are based on a screenshot of the entire web page or an extraction of a salient region combined with the title, while being only remotely related to the textual content. However, the evaluations conducted by both Dziadosz & Chandrasekar [6] and Aula et al. [2] suggest that combining text and image enables the user to judge relevance best. Dörk et al. [5] presented an exploratory search environment with a result representation heavily relying on zooming in various maps: in the temporal, spatial and semantic domain. However, with books, blogs and photos, only fairly structured types of content are considered, at least compared to arbitrary web pages.

3 System

Our system consists of three major components (Fig. 1): a crawler, a backend and a frontend. This section deals with the first two, while the frontend is described separately in Section 4.

Figure 1: The three system components: Frontend, Backend (Apache Solr) and the Crawler.

Crawler. The rich result representation prohibits the utilization of APIs of common search engines, as they deliver too little information about the web page, and crawling the pages in real time is infeasible given a reasonable number of results. Therefore, we developed our own crawler computing a colour histogram, a salient extract of the web page and a wordcloud as well as the text document for indexing. First, a screenshot of the web page is taken. For the colour histogram only the top 600 pixels of the web page are considered, as most websites have a characteristic header. The pixel values are clustered by KMeans [14] into ten groups and stored in a database. The computation of a salient region is carried out by the algorithm of Achanta et al. [1], where saliency is defined as a pixel's distance to the image's mean colour in Lab colour space. The saliency map is searched for areas of high saliency on multiple scales. The best candidate is selected and extracted from the screenshot. For text extraction the HTML content is first converted to plain text by nltk [3] and fed into the database. The text is further processed by removing stop-words, and each remaining word is scored by its frequency in the Brown corpus [7], which contains roughly one million words. The score is computed similarly to the tf-idf calculation [11] by the following formula:

\[
\mathrm{score}(w) =
\begin{cases}
tf(w)\,\log\bigl(|B| / tf_B(w)\bigr), & \text{if } w \in B \\
tf(w)\,\bigl(\log|B| + 1\bigr), & \text{else}
\end{cases}
\tag{1}
\]

with B being the set of words in the corpus and tf and tf_B the term frequencies in the web page and in the Brown corpus, respectively.

Backend. Our database relies on Apache Solr without any profound modifications. Besides a typical text field, we added fields for the additional features the crawler captures. Term weighting is implemented by the boosting mechanism of Solr. Communication with the JavaScript-based frontend is realized via HTTP and JSON encoding, which is natively supported by Solr.

4 Search User Interface

The user interface consists of five main elements, with query formulation and result representation as the most innovative ones. Query formulation is placed at the top of the screen and the result representation below. Both cover the entire width. Below them a navigable web page preview is set, surrounded by navigation buttons to the left and right. The small result representation and the big preview follow the design pattern of "overview and detail" [4], while the navigable web page previews can be seen as contextual cues in the result space. To facilitate getting started and to ensure conformity with the user's expectation, just a simple, commonly known text box is presented at the start. After submitting an initial query, the layout transforms smoothly into the one shown in Fig. 2.

Figure 2: Screenshot of the prototype featuring (1) the search bar, (2) result representation, (3) a restart button, (4) the browsable web page preview and (5) preview for the next result in the list.

Query Formulation. In order to constitute a query, terms are first typed in the simple text box as usual. After submitting the first query, all terms move to the left side without overlapping each other. Note that this implies a positive initial weighting. For a refinement and the expression of uncertainty, terms can be moved horizontally. Fig. 3 depicts how the arrangement of terms affects the query semantics. With x being the position of a term in the interval [−1.0, 1.0], sgn(x) indicates whether the term is explicitly wanted or unwanted in the result documents. |x| denotes the confidence of the former statement. Single terms can be removed by triggering a small remove button that pops up on mouse over. New terms can be added by clicking the query bar at the position associated with the wanted weighting of the new term and simply starting to type. It is also possible to restart the entire search, i.e. to remove all terms, by clicking a single button.

Figure 3: Query mechanism with "confidence": Weighting terms positive and negative.

Result Representation. Because the result bar is elongated and, compared to common search user interfaces without preview, small, the crucial goal in designing the result representation was to keep it as compact as possible and to allow a horizontal arrangement. Previously seen web pages should be recognizable and the content of unknown web pages should be as obvious as possible when looking at the result representation. Our approach consists of three different constituents (Fig. 4):

• A colour bar (1) on the left as well as the background of the whole element indicate frequent colours of the respective web page. When colours are known in advance, this allows the user to quickly rediscover a previously visited web page because colour is a pre-attentive attribute [13]. Otherwise, the bar at least provides useful cues on what to expect, e.g. websites for children are often very colourful while a business website is likely to have black text on a white background.

• An extract of the rendered web page's screenshot (2) provides a small preview of the most salient region of the web page. This might also support recognition and could additionally serve as a hint for the web page's topic.

• The wordcloud (3), whose computation is described above, gives a general overview of the website's content by putting an emphasis on words that occur rarely in general but frequently in this document and are hence more likely relevant for the current topic.

Figure 4: Top: Conceptual representation with (1) four main colours (including background), (2) a salient extract of the web page's screenshot, (3) the wordcloud indicating a topic by presenting frequent words. Bottom: Two result representations from our crawler implementation. It can be seen how a web page's background colour affects the representation.

Figure 5: The query bar containing the terms of the second evaluation task.

By dragging a result representation into the query bar (Fig. 5), the query can be manipulated depending on the result's content. When there is an intersection between the wordcloud of the element and the query terms, the shared terms are shifted to the left, giving them a more positive effect. If there is no intersection, the most popular word in the wordcloud is added to the query. This way exploratory search is further supported in the proposed search user interface concept.

Implementation. The implementation of the described concept is based on current web standards: JavaScript for the logic and SVG using raphael.js¹ for rendering graphics. Since the elements of the result representation are not retrievable from common APIs, we had to decide between re-crawling the elements in real time as soon as results from an API are delivered and building our own index with colours, salient region and wordcloud directly stored. We decided in favour of the latter, as re-crawling takes too much time and results could not be presented instantaneously. The index is based on an unchanged (except for configuration) Apache Solr² server. It is filled with content by our own crawler, which captures a screenshot for colour and salient region extraction and processes the HTML content, ending up with the term scores (see Formula 1) of the wordcloud. The crawler is implemented in Python using nltk [3], scikit-learn [18] and various scripts, e.g. A. Müller's³ for wordcloud rendering. We crawled two indices, a general one without restrictions (85 entries) and a special one with travelling and recipe sites only (455 entries), where the feature of vague query formulation is a big benefit.

An open question is how the system behaves at a larger scale, but as we use Solr for storage and query handling, we are confident that the system scales well, possibly by utilizing Apache Hadoop [20]. The user interface was not optimized to work in a mobile context such as on a tablet. However, due to the use of standard techniques it also runs on a Google Nexus 7 (2013) with only minor drawbacks.

¹ http://raphaeljs.com (28.10.2013)
² http://lucene.apache.org/Solr (28.10.2013)
³ https://github.com/amueller/word_cloud (28.10.2013)

5 Study Design

We conducted two evaluations with 17 participants in total, i.e. nine and eight participants respectively: A formative evaluation guided us in some design decisions. A summative one tested the final prototype implementation. Note that the evaluation was originally carried out in German and translated to English for this paper.

5.1 Formative Evaluation

The entire formative evaluation was implemented as an interactive form, where the study participants were asked to interact with mock-ups of parts of the later implemented user interface. We offered a discrete and a continuous version of the query formulation mechanism (Fig. 5) and tried to assess which one is easier to handle. To this end, we created two challenges:

Query formulation. The first task involved creating a query given the following brief note about the goals as well as the terms we wanted to be used: You are looking for a destination for your hiking vacations in the mountains, not necessarily in the alps as you have been there before. As you suffer from vertigo you want to avoid climbing. The results in Fig. 6 show that the test users were able to formulate a proper query, i.e. putting the relevant terms to the left side of the query and the negative ones to the right side.

Figure 6: Results of the formative evaluation ("vacation" example): A dark box indicates a strong correlation between the respective values.

Query understanding. To solve the second task, users were asked to do the inverse. Given a final query formulation, six different images needed to be ordered or removed. Five of them were images of cakes, the sixth one was an image of a dog. This way, we wanted to see if the representation of a query in the query bar (Fig. 5) is understandable. Furthermore, it gives insight into a deeper interpretation by the participants: Should the dog be in the result list though it has no relation to the query terms? If yes, should it be placed in front of the unfitting results? The results reported in Fig. 7 suggest that the basic principle has been understood, as the rightmost images were correctly put on top of the results in most cases.

Regarding the dog, the participants agreed on scoring it lower than all cakes. But there was disagreement on whether to include the result or to remove it.

Figure 7: Formative evaluation with "cake" example.
⁴ Study participants saw a copyright-protected image.

Result representation. In addition to the query formulation, we also evaluated prototypes of the result representation (Fig. 4). Four manually assembled representations of web pages were provided and we asked the study participants for possible search terms, a category of the web page and which traits of the representation were pivotal for that decision. Not all participants filled out all fields. But when they did, they correctly predicted the web page's content, with only one exception. Often, the participants were even able to specify the subtopic.

5.2 Summative Evaluation

The summative evaluation was carried out by giving the participants some tasks, while observing them and taking notes. Afterwards, they were given the opportunity to express feedback.

In general, most users succeeded in working with the search user interface. Minor problems involved confusing the colours of the user interface with those of the result representation and interpreting the plus/minus icons at the ends of the scale as actual buttons. We attribute this to the short time frame the participants had to get used to the prototype and its underlying novel concept. Colours in the result representation, which indicate page colours, were confused with the colour scale for weighting a term. The plus and minus icons at the ends of the scale were sometimes mistakenly interpreted as buttons.

Furthermore, we found that the search bar can be seen as a text field in the user's mental model and could therefore support corresponding interactions (i.e. placing a cursor and editing text). However, all participants considered the term weighting a useful tool and the majority liked the result representation as well. The ratings shown in Fig. 8 indicate minor problems regarding the transition from common user interfaces, while both the result representation and the weighting ability are for the most part considered good.

Figure 8: User ratings of our prototype on a five-grade scale from strong green (good/easy) to strong red (bad/difficult) for the questions: a) How easy was the transition from common search engines?, b) How do you like the representation of the results?, c) How useful did you find the weighting ability?

6 Conclusion

We presented a novel search user interface concept for exploratory web search addressing the problem of incorporating uncertainty with respect to the user's confidence while searching. The main contributions are a novel query formulation mechanism and a compact visualization. The latter supports efficient recognition. It also helps users to discern a web page's topic by linking visual and textual information. The implementation demonstrates the feasibility of the concept and the small evaluation suggests that users are able to properly interact with the interface.

Future work will cover the improvement of the system's usability in practice, for instance by offering a function to save interesting web pages and by using more elaborate methods for visual and textual information extraction in the crawler. The compact representation of results might also be interesting for mobile use, where screen space is limited.

Acknowledgement

We thank the flickr users tjstaab, freakgirl, lovebuzz, Kirti Poddar and jeff ro for releasing their images under a creative commons licence.

References

[1] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk. Frequency-tuned salient region detection. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 1597–1604, 2009.

[2] A. Aula, R. M. Khan, Z. Guan, P. Fontes, and P. Hong. A comparison of visual and textual page previews in judging the helpfulness of web pages. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 51–60, New York, NY, USA, 2010. ACM.

[3] S. Bird, E. Klein, and E. Loper. Natural Language Processing with Python. O'Reilly Media Inc., 2009.

[4] A. Cockburn, A. Karlson, and B. B. Bederson. A review of overview+detail, zooming, and focus+context interfaces. ACM Comput. Surv., 41(1):2:1–2:31, 2009.

[5] M. Dörk, S. Carpendale, and C. Williamson. Fluid views: A zoomable search environment. In Proceedings of the International Working Conference on Advanced Visual Interfaces, AVI '12, pages 233–240, New York, NY, USA, 2012. ACM.

[6] S. Dziadosz and R. Chandrasekar. Do thumbnail previews help users make better relevance decisions about web search results? In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '02, pages 365–366, New York, NY, USA, 2002. ACM.

[7] W. N. Francis and H. Kucera. Brown corpus manual. Brown University Department of Linguistics, 1979.

[8] T. Gossen, M. Nitsche, S. Haun, and A. Nürnberger. Data exploration for bisociative knowledge discovery: A brief overview of tools and evaluation methods. In M. R. Berthold, editor, Bisociative Knowledge Discovery, volume 7250 of Lecture Notes in Computer Science, chapter Part IV, pages 287–300. Springer Berlin Heidelberg, 2012.

[9] B. J. Jansen, A. Spink, and T. Saracevic. Real life, real users, and real needs: a study and analysis of user queries on the web. Information processing & management, 36(2):207–227, 2000.

[10] B. Jiao, L. Yang, J. Xu, and F. Wu. Visual summarization of web pages. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '10, pages 499–506, New York, NY, USA, 2010. ACM.

[11] K. S. Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of documentation, 28(1):11–21, 1972.

[12] S. L. Koshman. Vibe user study, 1997.

[13] H. Levkowitz. Color theory and modeling for computer graphics, visualization, and multimedia applications. Springer, 1997.

[14] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In L. M. L. Cam and J. Neyman, editors, Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. University of California Press, 1967.

[15] G. Marchionini. Exploratory search: From finding to understanding. Commun. ACM, 49(4):41–46, Apr. 2006.

[16] M. Nitsche and A. Nürnberger. Quest: Querying complex information by direct manipulation. In S. Yamamoto, editor, Human Interface and the Management of Information. Information and Interaction Design, volume 8016 of Lecture Notes in Computer Science, pages 240–249. Springer Berlin Heidelberg, 2013.

[17] K. A. Olsen, R. R. Korfhage, K. M. Sochats, M. B. Spring, and J. G. Williams. Visualization of a document collection: the vibe system. Information Processing & Management, pages 69–81, 1993.

[18] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in python. J. Mach. Learn. Res., 12:2825–2830, Nov. 2011.

[19] J. Teevan, E. Cutrell, D. Fisher, S. M. Drucker, G. Ramos, P. André, and C. Hu. Visual snippets: Summarizing web pages for search and revisitation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '09, pages 2023–2032, New York, NY, USA, 2009. ACM.

[20] T. White. Hadoop: The Definitive Guide. O'Reilly, first edition, 2009.
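
Illustrative sketch of the term score (Formula 1). The wordcloud scoring described in the Crawler paragraph of Section 3 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the crawler's actual code: the function name score_words, the token lists and the stop-word set are hypothetical, the natural logarithm is assumed (the paper does not state the base), and |B| is taken as the number of distinct corpus words because B is defined as a set.

```python
import math
from collections import Counter


def score_words(page_tokens, corpus_tokens, stop_words=frozenset()):
    """Score each remaining page word as in Formula 1 (hypothetical helper).

    page_tokens   -- tokens of the web page's plain text
    corpus_tokens -- tokens of the reference corpus (the Brown corpus in the paper)
    stop_words    -- words dropped before scoring (assumed to be supplied by the caller)
    """
    tf_corpus = Counter(corpus_tokens)   # tf_B(w): frequency in the corpus
    corpus_size = len(tf_corpus)         # |B|: distinct corpus words (assumption)
    tf_page = Counter(t for t in page_tokens if t not in stop_words)  # tf(w)

    scores = {}
    for word, tf in tf_page.items():
        if word in tf_corpus:
            # First case: words that are frequent in the corpus get a low score.
            scores[word] = tf * math.log(corpus_size / tf_corpus[word])
        else:
            # Second case: words unknown to the corpus score above any known
            # word with the same page frequency.
            scores[word] = tf * (math.log(corpus_size) + 1)
    return scores


if __name__ == "__main__":
    # Toy data standing in for a crawled page and for the corpus.
    corpus = ["the", "the", "the", "cake", "dog"]
    page = ["cake", "cake", "hiking", "the"]
    ranked = sorted(score_words(page, corpus, {"the"}).items(),
                    key=lambda item: item[1], reverse=True)
    for word, score in ranked:
        print(f"{word}: {score:.3f}")
```

Sorting the resulting scores in descending order gives the words a wordcloud would emphasise. Note that a word whose corpus frequency exceeds |B| receives a negative score under this reading of the formula, which is one reason stop-words are removed beforehand.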