ReCollection: a Disposal/Formal Requirement-Based Tool to Support Sustainable Collection Making

Francis Rousseaux, Alain Bonardi, and Kevin Lhoste
IRCAM, Place Stravinsky, 75004, Paris, France
{francis.rousseaux,kevin.lhoste}@univ-reims.fr
alain.bonardi@ircam.fr
http://www.ircam.fr

Abstract. Modern Information Science deals with tasks which include classifying, searching and browsing large numbers of digital objects. The problem today is that our computerized tools are poorly adapted to our needs, as they are often too formal: we illustrate this in the first section of this article with the example of multimedia collections. We then propose a software tool, ReCollection, for dealing with digital collections in a less formal and more sustainable manner. Finally, we explain how our software design is strongly backed by both artistic and psychological knowledge concerning the ancient human activity of collecting, which, as we will see, can be described as a metaphor for categorization in which two irreducible cognitive modes are at play: aspectual similarity and spatio-temporal proximity.

Key words: information retrieval, cognitive modeling, figural collection, class, spatial metaphor

1 Multimedia Collections

1.1 Technological Context

Our modern WIMP-based interfaces were created in the early 70s. They were used on computers with low storage capacities, slow processing speed, relatively low connectivity and low-resolution monitors. These computers were first used in offices and administrations, where the desktop metaphor fitted very well. Then, personal computers brought this kind of hardware to people's homes, and the desktop metaphor still fitted, as computers were mainly used for editing and filing documents.

Since those times, the technology has leaped forward, and today a large portion of the population uses a computer and connects to the internet on a daily basis.
Here in France¹, 9 out of 10 people in the 18-24 age group use a computer and the internet daily. Multimedia contents can be downloaded from the internet, or imported from digital devices such as cameras, which have also become mainstream.

¹ Phone survey by TNS SOFRES for the group Casino / Les Français et l'ordinateur, L'Hémicycle, 15-16/04 2005.

Not surprisingly, a huge market has emerged from these multimedia collections. We can now choose from a myriad of computerized tools which assist us in finding, retrieving, recording, creating, editing, browsing and classifying multimedia contents. The variety of tools at hand seems to fit the variety of uses involved in multimedia computing, from the most creative ones (such as graphic design or audio synthesis) to the most formal ones (classification in particular). However, there don't seem to be many tools bridging the gap between these two seemingly opposing polarities.

1.2 Collecting: Between Formalism and Creativity

Let us illustrate this situation. First, let us suggest that looking for new material and classifying are two important processes involved in collecting. Indeed, when someone decides to start building a collection, he usually already possesses a few items. Then, to extend this collection, new items must be added. In order to do so, the collector goes into the world and looks for these new items. Then, as the collection builds up, the need to arrange the items into categories becomes clearer, as the collection cannot simply remain a messy stack of unordered items.

If he had decided to collect digital music, and to go online to find new items for his collection, the process would have been rather similar. Commercial music download sites allow the user to browse through predefined music categories, thus implementing a kind of virtual record shop with the same problems mentioned earlier.
The search tool, however, can come in handy, allowing the user to search for the name of an artist, a song, an album or even a musical genre. All of this is still editorial information, which isn't necessarily the most useful to the collector. Then, when the music is downloaded, the album consists of a group of compressed audio files containing preset meta-tags, again storing editorial information. When he browses these files in his audio player, the songs are defined and classified automatically, not always according to the collector's desires. His final attempt is then to create a set of folders on his disk, and arrange his items in these folders. But how does he name these folders? What if he wants to arrange and browse the items in multiple ways? What if a particular item doesn't fit in any folder, or could be placed in two or three different categories? Pachet has also described many problems in the area of Electronic Music Distribution [?].

As we see from this example, the tools that the everyday user has at hand are too formal, and are poorly adapted to the growing activity of collecting multimedia contents. Indeed, what we have said for music can also be said for the other kinds of media, as well as for information search, file sharing, etc.

Attempts have been made at putting the human user back in control of the collecting process, rather than relying purely on predefined categories and automated search algorithms. However, it has become obvious that the other extreme of handing complete control over to the user isn't optimal either. Let us take a look at online content sharing sites, such as the famous Flickr™. There is no categorization here, but there are three main strategies when looking for photos: date, location, tags. The first two are self-explanatory, but the tags are more interesting here. When someone uploads a photo to the website, they can link a certain number of keywords, called tags, to this photo.
Then, we can either browse through the most popular tags, or type a tag into a textbox for a more precise search. The users then have complete freedom in the way they choose to define their photos. But the problem is that many photos aren't tagged, and those that are often have poorly chosen tags, making them difficult to retrieve.

Therefore, we believe that an optimal solution to the problem of digital collections could lie somewhere between these two polarities: predefined categories and total user creativity.

1.3 Examples of Tools Attempting to Bridge the Gap

MusicBrowser is a software tool which aims at indexing large and unknown music collections, and also at helping the user find "interesting" music in these collections [?]. When digital sound files are imported into the system, they are analyzed, and a database of their acoustic properties is created or updated. The user can then browse through the collection in a traditional manner, relying on editorial information. He can also create his own categories intuitively. He starts by creating a category and giving it a name. This can be totally subjective if he wishes: he may call it "evening music", "happy music", "favorite", etc. He then adds a few songs to this category, before asking the program to finish classifying, based on acoustic similarities. Of course, the more categories there are, and the more examples there are, the easier it is for the system to classify the entire collection. However, if there are mistakes, the user may simply move a song from one category to another, and ask the system to start again. This creative feedback loop, between user input and automated algorithms, will eventually lead to a satisfying classification for the user, who will have saved a lot of time in the process. He will then be able to create other classifications of the same collection if he wishes, and switch instantly between any of them. He may also share these classifications or download others.
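
The feedback loop just described for MusicBrowser can be sketched in a few lines. This is only an illustration under stated assumptions: the classifier actually used by MusicBrowser is not specified here, so a simple nearest-centroid rule stands in for it, and the feature vectors, song names and the function names `classify_rest` and `correct` are hypothetical.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify_rest(features, examples):
    """Label every song in `features` (song -> acoustic vector).

    `examples` maps each user-made category to the songs the user placed in it.
    Example songs keep their category; every other song is assigned to the
    category whose centroid is nearest (an assumed stand-in for "acoustic
    similarity").
    """
    centroids = {
        category: centroid([features[song] for song in songs])
        for category, songs in examples.items()
        if songs
    }
    labels = {song: category for category, songs in examples.items() for song in songs}
    for song, vector in features.items():
        if song not in labels:
            labels[song] = min(centroids, key=lambda c: dist(vector, centroids[c]))
    return labels


def correct(examples, song, new_category):
    """Feedback step: the user moves `song` into `new_category` as a new example."""
    for songs in examples.values():
        if song in songs:
            songs.remove(song)
    examples.setdefault(new_category, []).append(song)


# Hypothetical two-dimensional acoustic descriptors for six songs.
features = {
    "a": (0.1, 0.2), "b": (0.2, 0.1), "c": (0.9, 0.8),
    "d": (0.8, 0.9), "e": (0.55, 0.6), "f": (0.55, 0.5),
}
examples = {"evening music": ["a"], "happy music": ["c"]}

before = classify_rest(features, examples)   # "e" and "f" land in "happy music"
correct(examples, "e", "evening music")      # the user disagrees about "e"
after = classify_rest(features, examples)    # "f" now follows "e" into "evening music"
```

In this toy run, a single correction moves the category centroid enough to reclassify a neighbouring song as well, which is the time-saving effect the feedback loop is meant to provide.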

IMEDIA is a research project focused on indexing large collections of photos, and on interactive searching and browsing [?]. When photos are added to the system, they are analyzed and a database of visual descriptors is created or updated. One of the main features of the program is allowing the user to search for similar photos. At first, a list of random images from the collection is displayed; the user may browse them, or view another set of random images. When he sees a photo he likes, he can select it and ask the system to find similar ones. For example, if he chooses a photo of a beach, then the system will display a list of photos of beaches. Once again, if the user isn't completely satisfied with the results, a "relevance feedback" system allows him to select the errors, and the system will take this into account in order to display a more relevant list of results.

As we shall see in the next section, we have tried to create a program more suitable to the particular process of collecting, which has an element of subjectivity, evolves over time and doesn't rely purely on similarities, as in the IMEDIA system for example.

2 ReCollection: An Experimental Software For The Creation Of Multimedia Collections

ReCollection is a computer program for searching, arranging and browsing digital content. As our collecting activities vary from one context to another, it is too ambitious to seek a general solution to the problem. Rather, particular application areas must be defined and isolated, so that a specific answer can be given, while always relying on a set of basic principles. Here, we shall discuss the software prototype we have created for the digital opera / open form opera Alma Sola².

2.1 A Useful Metaphor: the Art Collection

Artists and philosophers have described some very particular characteristics of collections. One of those, as noted by Wajcman [?], is that of excess in a collection.
This means that the number of collected items exceeds the collector's capacity for memorization, and also for physical storage and exhibition in the gallery. Thus, there is a need for at least one reserve, where the excess can be stored. For example, the Georges Pompidou National Museum of Modern Art, Paris, owns about 59,000 artworks, making it one of the largest modern and contemporary art collections in Europe. Obviously, all the items cannot be exhibited in the galleries at once, so a very large portion is stored in the reserves. Often, the items in reserve are stored in heaps, in random locations, and they aren't always labeled, which makes it difficult to find and retrieve objects. The reserve allows us to handle the excess in collections, which is a problem in many of today's computer applications. Our multimedia collections, for example, are becoming very large and we are often losing control over them.

On the other hand, objects which are currently exhibited are found in the gallery. Here, the objects follow a spatio-temporal arrangement defining a finite number of visitation paths. The closeness in space of certain artworks and the chronological order in which they are approached are set carefully by the curator, as they strongly influence the visitors' experience. This aspect is also very important, and we shall discuss it later in detail.

² Designed by Alain Bonardi, IRCAM, Paris and performed at Le Cube, Issy les Moulineaux, October 2005.

2.2 The Reserve

The ReCollection software has two main modes: reserve and gallery. The reserve allows us to store the objects which aren't exhibited in the gallery. There are many objects in the reserve, and these are not always labeled; they are also rarely arranged in an orderly and tidy manner. So when we visit the reserve, we have no choice but to wander around, picking up objects, inspecting and identifying them one at a time.
The reserve can also be compared to the attic, in which our family possessions are stored similarly. As we explore our attic, we may happen to pick up an old photo album which we had completely forgotten about. This item will surely bring back memories and emotions. We can then choose to keep this album under our arm as we continue to explore the attic, or we can leave straight away and put it on our fireplace, for example, making it visible to visitors. It is all these pleasant and familiar experiences which we believe can be recreated thanks to the modeling of the reserve in our computer program.

2.3 The Gallery

A collective activity involving a number of objects at a time is their relative arrangement in the gallery space. To the location of objects in this space, we have added their color; these two properties make up an extra conceptual layer which is the framework for the creation and management of our collections.

In ReCollection, there is always at least one gallery, and the user can create as many as he wishes. There is always at least one item in a gallery: some basic content that the user can interact with, a starting point for his collection. The objects can be placed and arranged manually in the gallery space, using click and move, just as in common user interfaces.

The user can also rely on two algorithms to lay out the objects automatically. The first one, inspired by the cataRT software [?], calculates the objects' positions and colors according to descriptors chosen by the user. The second calculates the positions from a sample of objects selected by the user. A Principal Components Analysis (PCA) finds out which descriptors vary most among the objects of the sample; the system can then rearrange the whole gallery according to these descriptors, as in the first method.
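
The two automatic layout methods can be illustrated with a short sketch. It is not the ReCollection implementation: the exact rule used to turn the PCA result into a choice of descriptors is not given above, so ranking descriptors by the variance they carry within the first two principal components is an assumption, as are the function names, the toy data, and the premise that descriptors are already on comparable scales.

```python
import numpy as np


def layout_by_descriptors(X, x_idx, y_idx, color_idx):
    """First method: chosen descriptor columns become x, y and a colour in [0, 1]."""
    X = np.asarray(X, dtype=float)
    c = X[:, color_idx]
    span = c.max() - c.min()
    color = (c - c.min()) / span if span > 0 else np.zeros_like(c)
    positions = np.column_stack([X[:, x_idx], X[:, y_idx]])
    return positions, color


def rank_descriptors_by_pca(X, sample_idx, n_components=2):
    """Second method, step 1: rank descriptors by their share of the sample's
    variance within the leading principal components (assumed selection rule)."""
    S = np.asarray(X, dtype=float)[sample_idx]
    S = S - S.mean(axis=0)                      # centre the sample
    # Rows of vt are the principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(S, full_matrices=False)
    k = min(n_components, len(s))
    weights = (s[:k, None] ** 2 * vt[:k] ** 2).sum(axis=0)
    return np.argsort(weights)[::-1]            # most influential descriptor first


# Hypothetical gallery: 6 objects, 3 descriptors each.
X = [
    [-1, 5, -10],
    [-1, 5, 10],
    [1, 5, -10],
    [1, 5, 10],
    [0, 2, 3],
    [2, 8, 1],
]
sample = [0, 1, 2, 3]                           # objects picked by the user

ranked = rank_descriptors_by_pca(X, sample)     # descriptor 2 first, then 0, then 1
# Step 2: rearrange the whole gallery with the first method.
positions, color = layout_by_descriptors(X, ranked[0], ranked[1], ranked[2])
```

Within the sample, descriptor 2 varies most and descriptor 1 not at all, so the whole gallery is laid out along descriptors 2 and 0, with descriptor 1 left over for the color.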
The arrangements resulting from the algorithmic calculations can always be modified manually in order to correct them (in the case of rather subjective descriptors), to build up a global figure, or to bring items together.

Once all the items of interest have been imported from the reserve, through browsing or searching, and once they have been arranged in the gallery space, the user has a first arrangement he can play with. When he browses the gallery space, his experience will be influenced by the fact that certain objects are close in space, and in time of visitation. Although this is interesting in itself, the system can help the user go further, by defining a set of guided visits, which are simply an order of visitation of selected objects in the gallery.

The type of interface we have chosen to implement these functionalities is a 2D zoomable user interface (ZUI), inspired by Ken Perlin's Pad [?]. All objects are in the same 2D space, which has no borders. The point of view can be moved vertically and horizontally, and the user can zoom in and out. If he zooms in on an item until it fills the screen, the sound is played back. This kind of interface has been tested experimentally; it has obtained good results, and has been proven reliable [?]. Its intuitive approach is appealing to us, particularly given our goal of intuitively collecting digital media. Finally, the spatial metaphor takes advantage of the users' spatial memory and cognitive abilities [?,?].

Fig. 1. The Gallery

3 Conclusion

Husserl used to say that consciousness is always consciousness of something, that consciousness always pre-dates the subject and the object, and puts them together in the process. There are no subjects or objects already existing independently that meet in the world to fill out a journal of experiences (the subject) and perhaps adapt to each other by induction.
In the same fashion, we could say that a collection is always a collection of something, in that the original process of categorization is the activity of collecting, implacably mixing abstraction and spatio-temporal arrangements, and producing as many metastable categories.

The current models for information search are too formal, and they assume that the function and variables defining the categorization are known in advance. In practice, however, when searching for information, experimentation plays a large part in the activity, not because of technological limits, but because the searcher does not know all the parameters of the class he wants to create. He has hints, but these evolve as he sees the results of his search. The procedure is dynamic, but not totally random, and this is where the collection metaphor is interesting.

The collector's experimentation is always carried out by placing objects in temporary and metastable space/time. Here, the intension of the future category has an extensive figure in space/time. And this system of extension (the figure) gives as many ideas as it does constraints. What is remarkable is that when we collect something, we always have the choice between two systems of constraints, irreducible one to the other. This artificial indifferentiation for similarity/contiguity is the only possible kind of freedom allowing us to categorize by experimentation.

Our prototype implements these ideas by allowing the user to lay out his objects in 2D space. This arrangement may be manual, automated or both; it may be based on similarity, spatial proximity or both. A global figure may emerge from this arrangement, influencing the browsing and also the extension of the collection. Local figures emerge, which are the temporary pseudo-classes illustrating the pre-categorization building process of collecting.
The art gallery metaphor fits very well, as it adds further meaning to the arrangement of the collected items in space, and models the excess in collections thanks to the reserve. By exploiting space in this way, the software interface takes advantage of our cognitive abilities in dealing with spatial information, and also of our ability to collect information and acquire knowledge.

Our next step is experimentation in order to validate our work. This could simply take the form of a series of sessions in which both novice and experienced users are asked to build up collections using the software. Through user feedback, we will have a first idea of how well the interface is understood, how useful the users find it and how easy it is to use. If this experiment is a success, as we believe it will be, we will continue our research and bring it to the next level. By integrating new functionality focused on indifferentiation for similarity/proximity, we will be able to build specific tools for a variety of applications in which the user's activity may be, at least metaphorically, described as building a figural collection.

References

1. François Pachet: Content Management for Electronic Music Distribution: The Real Issues. Communications of the ACM (April 2003)
2. Pachet, F., Aucouturier, J.-J., La Burthe, A., Zils, A. and Beurive, A.: The Cuidado Music Browser: an end-to-end Electronic Music Distribution System. Multimedia Tools and Applications, Special Issue on the CBMI03 Conference (2006)
3. N. Boujemaa and C. Nastar: Content-based image retrieval at the IMEDIA group of the INRIA. 10th DELOS Workshop, Audio-Visual Digital Libraries, Santorini (1999)
4. Gérard Wajcman: Collection. Nous (1999)
5. Schwarz D., Beller G., Verbrugghe B., Britton S.: Real-time corpus-based concatenative synthesis with CataRT. DAFx (2006)
6. Fox D., Perlin K.: Pad: An alternative approach to the computer interface. Proc. ACM SIGGRAPH'93 (1993)
7. Guiard Y., Bourgeois F., Mottet D.,
Beaudoin-Lafon M.: Beyond the 10-bit barrier: Fitts' law in multiscale electronic worlds. Proc. IHM-HCI 2001, Springer-Verlag (2001)
8. Seegmiller D., Mandler J.M. and Day J.: On the coding of spatial information. Memory and Cognition (1977)
9. Hasher L. and Zacks R.T.: Automatic and effortful processes in memory. Journal of Experimental Psychology (1979)
10. Jean-François Perrot: Objets, classes et héritage: définitions. In 'Langages et modèles à objets, Etat des recherches et perspectives', collection Didactique, INRIA, pages 3-31 (1998)
11. Michalewicz, Z.: Gilles-Gaston Granger: Formes, opérations, objets. VRIN (1994)
12. Jean Baudrillard: The System of Objects. Verso (2005)
13. François Pachet: Les nouveaux enjeux de la réification. L'Objet, 10(4) (2004)
14. Xavier Serra: Towards a Roadmap for the Research in Music Technology. ICMC 2005, Barcelona (September 2005)
15. Francis Rousseaux: La collection, un lieu privilégié pour penser ensemble singularité et synthèse. Revue électronique Espaces Temps. http://www.espacestemps.net/document1836.html (2005)
16. Walter Benjamin: Paris, capitale du XIXe siècle - le livre des passages. Le Cerf (1989)
17. Krzysztof Pomian: Collectionneurs, amateurs et curieux. Gallimard (1987)
18. Sylvie Tourangeau: Collection création, parcours désordonné, propos d'artistes sur la collection. http://collections.ic.gc.ca/parcours/laboratoire/livre/creation.html
19. Jean Piaget, Bärbel Inhelder: La genèse des structures logiques élémentaires. Delachaux et Niestlé (1980)
20. Roger Pédauque: http://rtp-doc.enssib.fr/rubrique.php3?id_rubrique=13
21. Francis Rousseaux: Singularités à l'oeuvre. Collection Eidétique, Delatour (2006)
22. Alain Bonardi: New Approaches of Theatre and Opera Directly Inspired by Interactive Data-mining. Sound & Music Computing Conference (SMC'04), pages 1-4, Paris (20-22 October 2004)
23. Patrick Brézillon: Context in Human-Machine Problem Solving: a Survey.
Knowledge Engineering Review, 14, 1-34 (1999)
24. François Pachet: Nom de fichiers : LeNom. Revue du groupe de travail STP. Maison des Sciences de l'Homme, Paris (2004)
25. Francis Rousseaux: Par delà les Connaissances inventées par les informaticiens: les Collections?. Intellectica, 2005/2-3, n° 41-42 (2006)
26. Jean-François Peyret: Trouver le temps, colloque Ecritures du Temps et de l'Interaction. http://resonances2006.ircam.fr/?bio=57, Ircam (June 2006)