Social approach to context-aware retrieval

Luca Vassena
University of Udine
via delle Scienze, 206
Udine, Italy
vassena@dimi.uniud.it

Appears in the Proceedings of the 1st Italian Information Retrieval Workshop (IIR'10), January 27–28, 2010, Padova, Italy. http://ims.dei.unipd.it/websites/iir10/index.html Copyright owned by the authors.

ABSTRACT

In this paper we present a general purpose solution to Web content perusal by means of mobile devices, named Social Context-Aware Browser. This is a novel approach to information access based on the users' context, whose aim is to retrieve what the user needs, even if she did not issue any query. Our solution is built upon a social model that exploits the collaborative efforts of the whole community of users to control and manage contextual knowledge, related both to situations and resources. This paper presents a general survey of our solution, describing the idea and presenting an implementation approach.

Categories and Subject Descriptors

H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval

Keywords

Context-aware retrieval, mobile search, social, folksonomy, Web 2.0

1. INTRODUCTION

Context-aware computing is a computational paradigm that has grown rapidly in the last few years, especially in the field of mobile devices. A key role in this new approach is played by the notion of context, which is roughly described as the situation the user is in. This concept encloses important information that could be used to affect the capabilities of mobile devices, adapting them to the user's needs. In particular, contextual data can be used to predict the user needs and to seek and retrieve information, thereby reducing the complexity of the user-device interaction and providing the right information in the right place at the right time. From this point of view, because of the huge amount of contextual information and its heterogeneity and uncertainty, mobile and context-aware computing environments represent a new challenge for Information Retrieval (IR). The combination of IR and context-aware computing has been named context-aware retrieval [4].

These considerations guided us towards a new approach to Web content production and consumption, where contextual data are exploited to capture the dynamic nature of the user needs, of the information available, and of the relevance of this information, typical of a mobile user in the real world. This approach is named Social Context-Aware Browser and its novelty is threefold. First of all, this is a radically new approach that aims at discovering "the query behind the context": to retrieve what the user needs, even if she did not issue any query [7]. Second, this is not a domain dependent application, but a new generic way of interaction and information access, able to adapt to every domain. Third, as current models for context-awareness are too limited for very general applications, this approach brings new models built upon the social dynamics at the basis of Web 2.0.

This paper is structured as follows. We first briefly survey related work (Section 2), presenting the Context-Aware Retrieval field and introducing the main ideas behind Web 2.0. We then describe our solution (Section 3), presenting a general survey, the main ideas, and an implementation approach. In Section 4 we present a brief discussion and finally we draw some conclusions and present future work (Section 5).

2. RELATED WORK

2.1 Context-Aware Retrieval

Context-Aware Retrieval (CAR) is an extension of classical Information Retrieval (IR) that incorporates contextual information into the retrieval process, with the aim of delivering information to the users that is relevant within their current context [4]. CAR systems are concerned with the acquisition of context, its understanding, and the application of behaviour based on the recognized context [11]. Typical CAR applications present the following characteristics [4]: a mobile user, i.e., a user whose context is changing; interactive or automatic actions, if there is no need to consult the user; time dependency, since the context may change; appropriateness and safety to disturb the user. Although CAR applications can be both interactive and proactive in their communication with the user, we concentrate on the proactive aspects, since they are more relevant to our proposal. Besides, we concentrate on the association between CAR and mobile applications, as they can be considered the prime field for CAR [4].

An example of a CAR system is the Ubiquitous Web [5], a solution based on the spontaneous annotation by a community of users of objects, places, and other people with Web accessible content and services. A more general system is represented by the MoBe framework [7]. In this application, a general inferential framework (based on ontologies and Bayesian networks) combines the information coming from sensors to infer new and more abstract contexts (user activities, needs, etc.), which are used to retrieve and execute the most relevant applications.

2.2 Web 2.0, the social web

With Web 2.0 [9] and social software we denote all web-based services with "an architecture of participation", that is, an architecture featuring a high interaction level among users and allowing users to generate, share, and take care of the content. Among the many tools provided by Web 2.0, we mainly focus on social bookmarking and folksonomies.

Social bookmarking is a method for organizing, searching, and managing documents of interest among users. In a social bookmarking system, users save links to documents of interest in order to remember them or share them with the community. Social bookmarking is strictly related to the concept of folksonomy, that is, the practice of annotating and categorizing content in a collaborative way, by means of informal tags. Folksonomies (a portmanteau of folk and taxonomy) allow users to easily and informally describe documents and content. This represents a powerful combination that has gained popularity as it allows a more natural and simpler management of knowledge. The use of freely chosen categorizations and the collaborative aspect in fact allow also non-expert users to classify and find information. Folksonomies and social bookmarking are used, for example, in well-known Web 2.0 systems like Flickr (www.flickr.com), Youtube (www.youtube.com), Del.icio.us (www.del.icio.us.com), etc.

Folksonomies however are criticized because the lack of terminological control could lead to unreliable and inconsistent results [3].

3. SOCIAL CONTEXT-AWARE BROWSER

3.1 Description

The Social Context Aware Browser (sCAB for short) [12] is a general purpose solution to Web content navigation by means of context-aware mobile devices. It allows a "physical browsing": browsing the digital world based on the situations in the real world. The main idea behind sCAB is to empower a generic mobile device with a browser able to automatically and dynamically retrieve and load Web pages, services, and applications according to the user's current context.

The sCAB acquires information related to the user and the surrounding environment, by means of sensors installed on the device or through external servers. This information, combined with the user's personal history and the community behaviour, is exploited to infer the user's current context (and its likelihood). In the subsequent retrieval process, a query is automatically built and sent to an external search engine, in order to find the most suitable Web pages for the sensed context and present them to the user.

As current models for context-awareness are too limited for very general applications like the sCAB, this approach brings new social models for CAR that exploit the collaborative efforts of the community of users. The community, in fact, is encouraged to define the contexts of interest, to share, use and discuss them, and to associate context to content (web pages, applications, etc.), in order to have a dynamic and more user-tailored context representation and to enhance the process of retrieval based on the users' actual situation.

In particular users can freely interact with resources and can define that a resource is useful (or not suited) to their current context, can associate resources to particular contexts, can explicitly define the context they are in, and finally can browse resources relevant for their current context.

3.2 Model

3.2.1 Context representation

We represent the context as a folksonomy. Each tag is simply a keyword or string of text and represents a single contextual value [8]. We divide the contextual tags into two categories:

• Concrete tags: represent the information obtained by a set of sensors. This information can be read from the surrounding environment through physical sensors (e.g., temperature sensor), or can be obtained from other software (e.g., calendar) through logical sensors. Concrete tags that directly refer to sensor values are represented using the triple tag notation, that is, tags that use a particular syntax (namespace:predicate=value) to define extra information. For example, geo:longitude=12.456 is a tag for the geographical longitude coordinate whose value is 12.456. Other concrete tags can be automatically obtained from the sensed values (e.g., afternoon, summer, ...).

• Abstract tags: represent the high level contextual information that is freely associated by the users to the concrete contexts, in order to detail their context description. Some examples are: home, shopping, etc.

The difference between the two categories is blurred, since the contexts cannot be unambiguously assigned to one or the other category. However this partition is helpful in order to distinguish the low level information coming from sensors from the high level contextual information introduced by users.

The user context is a "cloud" composed of an undefined number of concrete and abstract tags (Figure 1).

Figure 1: User's current context.

3.2.2 Operations

In the sCAB conceptual model [12] there are six main operations. The first two are performed automatically and continuously by the system. With the inference operation (Figure 2), starting from the concrete tags sensed by sensors, the most relevant abstract tags are retrieved and become part of the user's context representation. Then with the retrieval operation (Figure 2), starting from the set of all the tags in the user's current context, the most relevant resources are retrieved. For example, starting from the GPS coordinates, the system enhances the user's context with the abstract tags "walk out park dog"; then, starting from all the tags, the system retrieves resources relevant to the given context, such as Web pages that teach how to train dogs, etc.

Figure 2: Inference and retrieval operations.

The other four operations are strictly related to the user interaction: the main two are definition and annotation (Figure 3). The definition is used to manage the contextual information and it is performed when a user directly defines her context, or when she provides contextual tags during the annotation of a resource. In particular, this operation manages the associations between concrete and abstract tags, and the strength of their relationships. The annotation, on the contrary, is used to manage the association between contextual tags and resources and it is performed when the users link resources to particular contexts. We can imagine a user at a park with her dog: she wants to associate to her context a particular Web page teaching dog training. For this reason she bookmarks that resource with the contextual tags "out dog park sunny train". Doing so, first the added abstract tags are related to the sensed concrete tags and, for all the users with a similar concrete tag cloud, these abstract tags (or part of them) can become part of their context representation. Second, that particular Web page is enhanced with all the tags, and it will be automatically proposed to users every time they are in a similar context.

Figure 3: Definition and annotation operations.

As the users are the main actors in the process of context definition and resource annotation, problems related to the quality of context and resources are likely to appear. To cope with this problem we propose the adoption of a social evaluation/reputation mechanism. We exploit the ideas presented in [6]: every element in the model (users, contexts, resources) has a score that increases or decreases based on the community behavior. The score of each user is used to weight the operations she performs, while the scores of contextual tags and resources define their quality and relevance. If a resource annotated with contextual information is never used in that context, the related score decreases and more relevant resources will stand out.

3.3 Implementation approach

Concrete and abstract tags, and resources are the main elements in our implementation model. Concrete tags, as output of sensors, are exploited to retrieve the most relevant abstract tags, and in the same way all the tags are exploited to retrieve the most relevant resources.

In the following sections we show an implementation proposal and how the different operations in the model affect the system, from a low level point of view.

3.3.1 Indexes

We exploit two indexes. In the first one, called contexts index, abstract tags are indexed over concrete tags, while in the second one, called resources index, resources are indexed over the set of all tags (both concrete and abstract). The proposed approach is community based, thus the indexes and the inferential system are managed by remote servers and not stored on the mobile device. Since the approach is similar for both indexes, we show just the first one.

The contexts index is a matrix that describes the frequency of abstract tags over the concrete ones. Each column corresponds to a concrete tag, and each row corresponds to an abstract tag. Each entry in the matrix has three values (Figure 4):

• $U_{ij}$: represents the user that first associated the abstract tag i to the concrete tag j;

• $S_{ij}$: a score that defines how relevant the abstract tag i is for the concrete tag j. This value is in the interval [0, 1];

• $\sigma_{ij}$: a steadiness value that defines how steady the association between the abstract tag i and the concrete tag j is.

          c1     c2                                  ...
    a1
    a2           ($U_{22}$, $S_{22}$, $\sigma_{22}$)
    ...

Figure 4: Contexts index example

Intuitively, since not all the abstract tags can be related to all concrete tags, the proposed index will be a very sparse matrix. At the same time, because of the very high number of both concrete and abstract tags, the index can reach very large dimensions. However a lot of research is being performed on index design and analysis, also in the CAR field [2]. The related discussion is out of the scope of this work.

3.3.2 Users' score

In our approach two values are associated to each user and they define the goodness of the user in working with contextual information:

• $S_{U_c}$: a score that defines how good the user is in associating concrete tags to abstract tags;

• $S_{U_r}$: a score that defines how good the user is in associating resources to contexts.

As before, we concentrate only on the management of values related to concrete and abstract tags, since the approach is exactly the same when working at the higher level of tags and resources.

Every time a new relation between abstract and concrete tags is created with a definition ("filling a hole" in the index), the user who performed the operation is associated to that relation. Then, on the basis of how the community interacts with that contextual information, the user's score will be updated. It is calculated as follows: for each association among tags ij performed by the user U, $S_{U_c}$ corresponds to the mean of the products $\frac{\sigma_{ij}}{\sigma_{max}} \times S_{ij}$, where $\sigma_{max}$ is the maximum steadiness value in the index.

New associations have a low steadiness value, thus their score, as they have not steadied yet, will have low influence on the user's score. Good associations will have high score and steadiness values, and they will result in a high user's score. In the same way, low users' scores are due to bad associations between contextual tags. Since $S_{ij} \in [0, 1]$, also $S_{U_c} \in [0, 1]$.

In this approach, for simplicity, only new associations between tags are considered for the computation of the users' score. An extension could consider all the existing associations. In this way a user is "good" because she defines good new associations and because she exploits existing good associations.

3.3.3 Values update

The proposed indexes are not static: the values related to the association between concrete and abstract tags and resources are continuously updated, based on the interaction of users with resources in context.

With every definition operation the values in the contexts index are updated according to the following system (for the values in the resources index with the annotation operation the approach is similar):

• $\sigma_{ij}(t_{i+1}) = \sigma_{ij}(t_i) + S_{U_c}(t_i) \times \beta$

• $v = \dfrac{\sigma_{ij}(t_i) \times S_{ij}(t_i) \pm S_{U_c}(t_i) \times \beta}{\sigma_{ij}(t_{i+1})}$

• $S_{ij}(t_{i+1}) = \begin{cases} v & \text{if } v > 0 \\ 0 & \text{otherwise} \end{cases}$

where $t_i$ represents a discrete time instant and $t_{i+1}$ the subsequent time instant.

While the score is a value in the interval [0, 1], the steadiness is an always increasing value. The higher the steadiness of an association is, the more stable the association is, and the smaller the effect each update operation will have. The user's score is exploited for the update of the values in the index. It can either strengthen an association or weaken it (e.g., when a user removes a tag from her context). The higher the user's score is, the more effective the update operation will be. This means that good users have more influence on the system than bad users. Finally, β is a parameter greater than 0 and it is used to weight the user score: operations performed explicitly by users (inclusion or removal of abstract tags) have more effect than implicit updates performed automatically based on the interaction of the community with the resources.

3.3.4 Inference and retrieval

The inference and retrieval operations work respectively on the first and second index, but they are similar, thus in the following we explain just the inference one.

The approach is the following:

1. starting from the concrete tags in input, we consider only the set of abstract tags that have been associated with at least one of the concrete tags;

2. for each abstract tag we compute a rank value, to define an order of relevance for the abstract tags;

3. in order to limit the number of retrieved tags, we retrieve the abstract tags whose rank value is higher than the mean of all rank values.

The rank value is computed following an adapted version of the tf.idf weighting scheme. In particular for each considered abstract tag $a_i$ we have:

• $A = \sum_{c_j} \sigma_{ij} \times S_{ij}$, for each sensed concrete tag $c_j$;

• $B = \dfrac{|C|}{|\{c : a_i \in c\}|}$, where $|C|$ is the total number of sensed concrete tags, and $|\{c : a_i \in c\}|$ is the number of concrete tags to which the abstract tag $a_i$ has been associated;

• rank value $= A^{\alpha} \times B^{\beta}$, where α, β are parameters exploited to weight the different values.

Some considerations can be drawn. First, the more concrete tags in the current context an abstract tag is associated with, the higher its rank value will be. Second, abstract tags with high score and steadiness will have a higher rank value. Third, abstract tags related to particular sets of concrete tags will have a higher rank value than very general ones that are associated to a high number of concrete tags (high frequency).

In addition, starting from this basic approach, we can enhance the rank value computation by exploiting other information. For example a reasonable idea is to weight the tags based on their age in the user's context representation, giving more importance to the newest tags. In this way we enhance the importance of new contexts.

4. DISCUSSION

Although the conceptual ideas are clear, the implementation approach we propose is in an initial stage of definition. We suggested a possible solution, but there are several ways to refine it and several algorithms that could be exploited. For this reason evaluation holds an important role in our work: since different alternative solutions exist, it is important to evaluate them and compare their effectiveness.

Even if the knowledge related to the whole community is exploited to infer and refine the current context of single users, the proposed model differentiates the personal from the community level, giving more importance to the first one. For example if a user annotates a situation as "play", she is considered to be in "play" context, even if most people annotate the same situation as "work". On the contrary, if a user is for the first time in a situation (e.g., a location never visited), her context is refined just with the information from the community. Considering the previous example, as most people annotate the situation with "work", the user is considered to be in "work" context.

In the last case, the assumption made by the system in order to provide the user with relevant resources could be wrong. However this is not a problem. Since we are working with people, it will hardly be possible to provide results that totally satisfy each user, due to the intrinsic difference of views and needs in a community. Rather, our solution aims at a behavior that is good on average.

Talking about the indexes, we have seen how the related information is changed dynamically based on community interaction. However this is not the only possible approach. We can imagine complementary approaches that can support the community statistical one. For example, we could use some geographic gazetteer for associating geonames to geographic coordinates provided by the concrete tags, so as to reinforce the rank of associated abstract tags that contain the same geographic names or names of close localities. The geonames could be useful also for retrieving more relevant resources, those containing the geonames or close geonames.

5. CONCLUSIONS

In this paper we have presented the Social Context-Aware Browser, a general purpose solution to Web content perusal by means of mobile devices. The sCAB is a novel approach to information access based on context, where the community of users is called to manage the contextual knowledge, both related to situations and resources, through collaboration and participation. In particular we presented a general survey, the main ideas, and an implementation approach.

As future work we aim at implementing a prototype of the proposed system, and, in particular, we suggest a multistage approach, where implementation and evaluation processes will proceed hand in hand. As a first step we want to exploit benchmarks to evaluate detailed implementation solutions, like, for example, different algorithms to assess the relevance of tags for situations and resources. After that, we plan to apply an IIR evaluation methodology, involving users in a controlled environment, following the ideas presented in [1, 10]. Finally a broader user-centred evaluation will help us to understand if the sCAB is effective in the real world.

Acknowledgements

The authors acknowledge the financial support of the Italian Ministry of Education, University and Research (MIUR) within the FIRB project number RBIN04M8S8, and the region Friuli Venezia Giulia. This research has been partially supported by MoBe Ltd. (www.mobe.it), an academic spin-off company specializing in software for mobile devices.

6. REFERENCES

[1] P. Borlund. The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Information Research, 8(3):8–3, 2003.

[2] A. Göker, S. Watt, H. I. Myrhaug, N. Whitehead, M. Yakici, R. Bierig, S. K. Nuti, and H. Cumming. An ambient, personalised, and context-sensitive information system for mobile users. In EUSAI '04: Proceedings of the 2nd European Union symposium on Ambient intelligence, pages 19–24. ACM, 2004.

[3] S. A. Golder and B. A. Huberman. The structure of collaborative tagging systems. Arxiv preprint cs.DL/0508082, 2005.

[4] G. J. F. Jones and P. J. Brown. Context-aware retrieval for ubiquitous computing environments. In Mobile HCI Workshop on Mobile and Ubiquitous Information Access, volume 2954, pages 227–243. Springer LNCS, 2004.

[5] D. Lopez de Ipiña, J. I. Vazquez, and J. Abaitua. A context-aware mobile mash-up platform for ubiquitous web. In Proc. of 3rd IET Intl. Conf. on Intelligent Environments, pages 116–123, 2007.

[6] S. Mizzaro. Quality control in scholarly publishing: A new proposal. J. of the Am. Soc. for Information Science and Technology, 54(11):989–1005, 2003.

[7] S. Mizzaro, E. Nazzi, and L. Vassena. Retrieval of context-aware applications on mobile devices: how to evaluate? In Proc. of Information Interaction in Context (IIiX '08), pages 65–71, 2008.

[8] S. Mizzaro, E. Nazzi, and L. Vassena. Collaborative annotation for context-aware retrieval. In ESAIR '09: Proceedings of the WSDM '09 Workshop on Exploiting Semantic Annotations in Information Retrieval, pages 42–45. ACM, 2009.

[9] T. O'Reilly. What is web 2.0, design patterns and business models for the next generation of software, 2005.

[10] D. Petrelli. On the role of user-centred evaluation in the advancement of interactive information retrieval. Inf. Process. Manage., 44(1):22–38, 2008.

[11] A. Schmidt. Ubiquitous Computing - Computing in Context. PhD thesis, Lancaster University, 2003.

[12] L. Vassena. Context-aware retrieval going social. In 3rd Symposium on Future Directions in Information Access (FDIA), 2009.
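
As an illustration of the triple tag notation described in Section 3.2.1 (namespace:predicate=value), the following minimal Python sketch splits a concrete tag into its three parts and treats any other string as a plain tag. The names `TripleTag` and `parse_triple_tag` are hypothetical and chosen only for this example; the paper does not prescribe a parser.

```python
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    """The three parts of a namespace:predicate=value tag (hypothetical type)."""
    namespace: str
    predicate: str
    value: str


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Return the parts of a triple tag, or None if the tag is a plain keyword."""
    # Split once on '=' to isolate the value, then once on ':' for the rest.
    head, eq, value = tag.partition("=")
    namespace, colon, predicate = head.partition(":")
    # Every separator and every part must be present for a valid triple tag.
    if not (eq and colon and namespace and predicate and value):
        return None
    return TripleTag(namespace, predicate, value)
```

With this sketch, the example tag from the text, `geo:longitude=12.456`, yields namespace `geo`, predicate `longitude` and value `12.456`, while a derived concrete tag such as `afternoon` is reported as a plain tag.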
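
The contexts index of Section 3.3.1, the users' score of Section 3.3.2 and the update rule of Section 3.3.3 can be sketched together as follows. This is only a possible reading under stated assumptions: the sparse matrix is stored as a dictionary keyed by (abstract tag, concrete tag); a newly created association starts with score 0 and steadiness 0 (the paper does not give initial values); and all names (`Entry`, `update`, `user_score`, `reinforce`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Entry:
    user: str          # U_ij: the user who first created the association
    score: float       # S_ij, kept in [0, 1]
    steadiness: float  # sigma_ij, never decreases


# Sparse contexts index: only existing associations are stored.
ContextsIndex = Dict[Tuple[str, str], Entry]


def update(index: ContextsIndex, abstract: str, concrete: str, user: str,
           user_score: float, beta: float, reinforce: bool = True) -> None:
    """Apply one definition operation to the association (abstract, concrete)."""
    weight = user_score * beta  # S_Uc(t_i) * beta
    entry = index.get((abstract, concrete))
    if entry is None:
        # Assumption: a new association starts with score 0 and steadiness 0.
        entry = index[(abstract, concrete)] = Entry(user, 0.0, 0.0)
    new_steadiness = entry.steadiness + weight
    sign = 1.0 if reinforce else -1.0  # the +/- in the update system
    if new_steadiness > 0:
        v = (entry.steadiness * entry.score + sign * weight) / new_steadiness
    else:
        v = 0.0  # a user with score 0 leaves a brand-new entry at 0
    entry.score = max(v, 0.0)  # S_ij = v if v > 0, otherwise 0
    entry.steadiness = new_steadiness


def user_score(index: ContextsIndex, user: str) -> float:
    """Mean of (sigma_ij / sigma_max) * S_ij over the associations created by user."""
    own = [e for e in index.values() if e.user == user]
    sigma_max = max((e.steadiness for e in index.values()), default=0.0)
    if not own or sigma_max == 0:
        return 0.0
    return sum(e.steadiness / sigma_max * e.score for e in own) / len(own)
```

Under these assumptions each reinforcing update is a weighted average between the old score (weighted by its steadiness) and 1 (weighted by the acting user's score times β), which is why the score stays in [0, 1] and why steady associations move less with each update.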
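
The three inference steps of Section 3.3.4 can be sketched in Python as below. The sketch assumes that the contexts index is available as a nested dictionary mapping each abstract tag to the concrete tags it has been associated with, each carrying a (score, steadiness) pair, and that the rank value is A raised to α times B raised to β; the function name `infer_abstract_tags` and the default parameter values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# index[abstract_tag][concrete_tag] = (score S_ij, steadiness sigma_ij)
Index = Dict[str, Dict[str, Tuple[float, float]]]


def infer_abstract_tags(index: Index, sensed: Sequence[str],
                        alpha: float = 1.0, beta: float = 1.0) -> List[str]:
    """Return abstract tags ranked above the mean rank, most relevant first."""
    ranks: Dict[str, float] = {}
    for abstract, row in index.items():
        hits = [row[c] for c in sensed if c in row]
        if not hits:
            continue  # step 1: keep only tags tied to at least one sensed tag
        # A: sum of steadiness * score over the sensed concrete tags.
        a = sum(steadiness * score for score, steadiness in hits)
        # B: sensed concrete tags over concrete tags associated with this tag.
        b = len(sensed) / len(row)
        ranks[abstract] = (a ** alpha) * (b ** beta)  # step 2
    if not ranks:
        return []
    threshold = mean(ranks.values())
    # Step 3: keep only the tags whose rank is strictly above the mean.
    selected = [tag for tag, rank in ranks.items() if rank > threshold]
    return sorted(selected, key=ranks.get, reverse=True)
```

One consequence of the strict "higher than the mean" rule, visible in this sketch, is that when a single abstract tag is a candidate its rank equals the mean and nothing is returned; an implementation might relax the comparison in that case.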