Towards User-Centered Retrieval Algorithms

Manuel J. Fonseca
Department of Computer Science and Engineering
INESC-ID/IST/Technical University of Lisbon
R. Alves Redol, 9, 1000-029 Lisboa, Portugal
mjf@inesc-id.pt

ABSTRACT
Nowadays almost all retrieval algorithms (for text, images, drawings, etc.) are mainly concerned with achieving good system-centered measures, such as precision and recall. However, these systems are used by users, who try to achieve goals through the execution of tasks. To better satisfy the users' needs we must involve them in the development process of the retrieval systems.

In this paper, we argue that a user-centered approach, where users are included in the development cycle of the overall retrieval system, can lead to improved retrieval algorithms and also to better user satisfaction while using the system.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; H.5.2 [Information Interfaces and Presentation]: User Interfaces - Graphical user interfaces (GUI)

General Terms
Design, Human Factors

Keywords
User-Centered Design, User-centered approach, Retrieval algorithms

1. INTRODUCTION
The majority of retrieval algorithms, whether they are for text, images, drawings, 3D objects, audio, video, etc., are mainly interested in performing well on system-centered measures, such as precision and recall. However, these systems are used by users who want to perform specific tasks and achieve specific goals. We can develop a good retrieval system that performs well against a predefined ground truth, but when we deliver it to users they may not be able to find what they want, or they may not even be able to submit a query to the system.

For illustration purposes let us consider the following hypothetical scenario: “We developed a system for retrieving generic complex vector drawings, such as technical drawings, architectural plans or clipart drawings. We evaluated it using query-by-example and a set of predefined drawings, achieving good precision and recall. Afterwards, when we delivered the system to users, we noticed that they were not able to use it, because they could not find the (first) drawing that they must use as a query to find the desired drawing. Moreover, users do not want to search for the complete drawing, but only for a subpart of the drawing.”

This scenario could be avoided if, before developing the retrieval system, we asked users what their needs were, what they wanted to do with the system and how they wanted to do it. To collect all this information we need to apply a user-centered approach where users are involved in the development of the retrieval system and algorithms.

In this paper we defend a user-centered approach as a way to create better retrieval algorithms and improve the overall retrieval system. We start by briefly describing the user-centered approach and the iterative cycle used in user interface design. In Section 3 we describe our application of the user-centered approach in the development of retrieval algorithms. Finally, we present some conclusions.

2. USER-CENTERED DESIGN
User-centered design (UCD) is a design methodology where the needs, skills and limitations of the users are taken into account during all stages of the development of the system. The key premise of user-centered design is that the active involvement of the users in the development process, as well as in the evaluation of the interactive products, can lead to well-designed systems that best meet the desired usability goals. These systems will take advantage of users' skills, will be relevant to their work and activities, and will help them rather than constrain their actions.
One of the principles of UCD [4] states that we first need to identify who the users will be (profile, skills, limitations, etc.) and what tasks they perform and/or wish to perform. The second principle states that the system should be exposed to users in the early stages of development to collect feedback from them. Finally, the third principle is iterative design. The results and feedback from user testing should be used to fix and improve the system. UCD assumes an iterative cycle with identification of the users' needs, design of the solution and evaluation, repeated as often as necessary, as depicted in Figure 1.

[Figure 1 shows a cycle with three stages: User and Task Analysis, Solution Design and Prototyping, and Evaluation.]

Figure 1: User-centered design iterative cycle.

3. USER-CENTERED RETRIEVAL
Typically, when we want to develop a new retrieval approach, we look at the media to retrieve (text, audio, video, drawings, images, etc.), identify the features that best describe the media, create a matching algorithm and finally compute precision and recall. Although this methodology allows us to create retrieval systems, we believe that including the user in the development cycle will allow us to deliver better and more usable retrieval systems, which let users achieve their goals, and not only systems that have good precision and recall performance.

Moreover, we should not develop retrieval systems, and that includes descriptor computation, matching algorithms and presentation of the results, without first identifying a set of user needs and functional requirements (the first step in user-centered design). We need to know our users, their skills, their background, their profile. We must identify their needs and requirements, their goals and how they achieve them. In summary, we need to do a user and task analysis before we start developing our retrieval system. User and task analysis should influence not only the design of the user interface, but also the design of the retrieval approach or algorithm.

For instance, users could use various strategies to perform a search in a drawing retrieval system. They could use a drawing that they already have, in a file, to search for similar drawings using query-by-example, or they could draw a sketch of the drawing that they want to find. As we can see, the retrieval solution (feature extraction, indexing and matching algorithms) will be different in each case. While in the first case we only need to compare two drawings of the same complexity and with the same characteristics (sets of lines and polygons), in the second case we need to compare complex drawings with sketches (typically simpler and with fewer elements). Thus, the way users perform the task to achieve their goal influences the retrieval approach that we should develop.

After developing the retrieval solution based on the user requirements, we should evaluate the retrieval system using not only system-centered measures, but also user-centered measures, such as time to complete tasks, error rates, satisfaction, etc. As in the user-centered design of interactive systems, results from the evaluation of the retrieval system (system- and user-centered measures) should be used to improve the system and to refine the user and functional requirements of the retrieval system.

One of the things that we observed in one evaluation session with users was that users did not care where in the order of retrieval the intended drawing appeared; the important fact was that it was there. One of the users commented: “It [the system] found it [the drawing]! That is what counts!” However, when we evaluate retrieval systems, the majority of the existing measures and ground truth datasets favor precision. Of course this system-centered evaluation is important, but we should also take into account the users' perspective, which favors recall.

3.1 An Example
Involving the users can affect the way we develop the retrieval algorithms. In recent years we developed a generic approach for complex vector drawing retrieval, based on the topology and geometry of the elements present in the drawing. These two features were used to describe the content of the drawings. During matching, we first compare the drawings using topology and then compare the geometry of those with similar topologies, giving the same weight to both features (for more details see [1]). This generic retrieval approach was used to develop one system for retrieving technical drawings [3] and another for retrieving clipart drawings [2].

Before we developed this solution and the two retrieval systems, we performed user and task analysis to understand how users wanted to make queries to this type of system. We noticed that they preferred to draw sketches of the drawing that they were looking for rather than submit an existing drawing to perform a query-by-example. Moreover, most of the time they do not have a drawing similar to the one that they are looking for.

The two systems were both evaluated with users, and from those evaluations we observed that the way users search for technical drawings was different from the way they search for clipart drawings [6]. For technical drawings, users drew more complete sketches with several visual elements, consequently defining a richer topological configuration, as illustrated in Figure 2. For clipart drawings, users produced simpler sketches, with fewer elements and with a poorer topological description (see Figure 3).

Figure 2: Sketch specifying a query to find a technical drawing.

Figure 3: Sketch specifying a query to find a clipart drawing.

Due to this observation during tests with users, we refined our retrieval algorithm for retrieving clipart drawings [5], putting more emphasis on geometry than on topology. With this change we were able to achieve better precision and recall for clipart drawings, and we adapted our retrieval system to the users' way of sketching queries.

3.2 Discussion
We cannot develop our retrieval algorithms without involving our users in the development cycle. As in the design of interactive systems, in the development of retrieval systems we must also involve the users.

They must be involved in the initial phase, so we can understand how they search for information, what their knowledge is, what their limitations are and what their profile is. With this we are able to identify users' needs and functional requirements.

Later on, during the development of the algorithms, we should take this input into account and adapt the algorithms to provide “good results” for “our” users, and not for users in general, or for the system.

Finally, during the evaluation stage, besides computing the traditional system-centered measures for a set of datasets defined as ground truth, we should also involve users in the evaluation to collect quantitative and qualitative measures. Information gathered during evaluation should be used to improve the retrieval algorithms and the overall retrieval system in the next iteration of the iterative cycle of the user-centered approach.

4. CONCLUSIONS
In this paper we defended a user-centered approach for the development of retrieval systems. As in the case of user interface design, for retrieval systems it is also important to know our users, adapt the algorithms to them, and involve the users in the evaluation of the system.

We believe, and we have confirmed, that the involvement of the user in the development cycle of retrieval systems can lead to better systems that satisfy users' needs and are better adapted to them.

5. ACKNOWLEDGMENTS
This work was supported by FCT through the PIDDAC Program funds (INESC-ID multiannual funding) and the Crush project, PTDC/EIA-EIA/108077/2008.

6. REFERENCES
[1] M. J. Fonseca. Sketch-Based Retrieval in Large Sets of Drawings. PhD thesis, Instituto Superior Técnico / Technical University of Lisbon, July 2004.
[2] M. J. Fonseca, B. Barroso, P. Ribeiro, and J. A. Jorge. Retrieving clipart images by content. In Proceedings of the 3rd International Conference on Image and Video Retrieval (CIVR'04), volume 3115 of Lecture Notes in Computer Science, pages 500–507. Springer-Verlag, Dublin, Ireland, July 2004.
[3] M. J. Fonseca, A. Ferreira, and J. A. Jorge. Content-based retrieval of technical drawings. International Journal of Computer Applications in Technology (IJCAT), 23(2–4):86–100, 2005.
[4] J. D. Gould and C. Lewis. Designing for usability: key principles and what designers think. Commun. ACM, 28(3):300–311, 1985.
[5] P. Sousa and M. J. Fonseca. Geometric matching for clip-art drawing retrieval. Journal of Visual Communication and Image Representation (JVCI), 20(2):71–83, February 2009.
[6] P. Sousa and M. J. Fonseca. Sketch-based retrieval of drawings using spatial proximity. Journal of Visual Languages and Computing (JVLC), 21(2):69–80, April 2010.

Copyright © 2011 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by the editors of euroHCIR2011. EuroHCIR '11, Newcastle, UK.