=Paper=
{{Paper
|id=Vol-1436/Paper22
|storemode=property
|title=UNED-UV @ Retrieving Diverse Social Images Task
|pdfUrl=https://ceur-ws.org/Vol-1436/Paper22.pdf
|volume=Vol-1436
|dblpUrl=https://dblp.org/rec/conf/mediaeval/CastellanosBGVC15
}}
==UNED-UV @ Retrieving Diverse Social Images Task==
* A. Castellanos, NLP & IR Group, UNED, C/ Juan del Rosal, 16, Madrid, Spain, acastellanos@lsi.uned.es
* X. Benavent, Department of Informatics, University of Valencia, Valencia, Spain, xaro.benavent@uv.es
* A. García-Serrano, NLP & IR Group, UNED, C/ Juan del Rosal, 16, Madrid, Spain, agarcia@lsi.uned.es
* E. de Ves, Department of Informatics, University of Valencia, Valencia, Spain, esther.deves@uv.es
* J. Cigarrán, NLP & IR Group, UNED, C/ Juan del Rosal, 16, Madrid, Spain, juanci@lsi.uned.es

===ABSTRACT===
This paper details the participation of the UNED-UV group at the 2015 Retrieving Diverse Social Images Task. This year, our proposal is based on a multi-modal approach that first applies a textual algorithm based on Formal Concept Analysis (FCA) and Hierarchical Agglomerative Clustering (HAC) to detect the latent topics addressed by the images and to diversify them according to these topics. Secondly, a Local Logistic Regression model, which uses the low-level features and some relevant and non-relevant samples, is adjusted and estimates the relevance probability for all the images in the database.

===1. INTRODUCTION===
Information retrieval systems have commonly been based on maximizing the relevance of the result list (i.e., in terms of accuracy-based metrics). However, retrieval systems, and especially those focused on image diversification, should be able to offer relevant but also diverse results. Users are not only interested in accurate results but also in results covering different topics or situations [1].

To address this task we propose an image representation using the concept or concepts covered by the textual information related to the images. This conceptual representation is built by means of Formal Concept Analysis, a data organization technique. In our participation in the 2014 edition of this task we showed that this approach was able to identify the different topics addressed in the images, allowing the diversification of the result list according to them [4].

This year we intend to go a step further by presenting a multimedia approach, since the aspects related to the visual information of the images were missing from our previous approach and it has been extensively shown that visual information has a great impact on information retrieval systems [10]. The visual approach presented uses a relevance feedback algorithm developed by the UV group and used in previous works [3], [5]. This method estimates the similarity probability of all the images of the database using the visual low-level features by means of a Local Logistic Regression model [8].

===2. SYSTEM DESCRIPTION===
Our multimedia system has two sub-systems: a Textual-Based Information Retrieval (TBIR) sub-system that works with textual information and generates clusters for diversity, and a Content-Based Information Retrieval sub-system that estimates the relevance of each of the images of the generated clusters.

====2.1 Textual-Based Information Retrieval====
Our proposal is based on discovering the latent topics addressed by the images by applying Formal Concept Analysis. A Hierarchical Agglomerative Clustering (HAC) [9] is then applied to group together similar images according to the detected formal concepts (those belonging to the same topic). Each HAC-based cluster may be considered as an image set covering a similar topic. Then, for each cluster, the visual features related to the images are applied to rank the cluster images according to their visual diversity.

=====2.1.1 FCA-based Modelling=====
Formal Concept Analysis (FCA) is a theory of concept formation [11] to organize formal contexts. A formal context is a structure K := (G, M, I), where G is a set of objects, M a set of attributes related to these objects and I a binary relationship between G and M, denoted by gIm: the object g has the attribute m. From the formal context, a set of formal concepts can be inferred (a formal concept is a pair (A, B) of images A and the features B shared by those images) and organized in a lattice from the most generic to the most specific one.

By applying FCA, the images in the test set are modelled in terms of formal concepts, which group together the images sharing the same set of features. In order to select only the most representative features, we applied Kullback-Leibler Divergence (KLD) [7] on the textual contents related to the images. This KLD-based selection represents each image by the textual contents that best differentiate that image from the other ones.

=====2.1.2 HAC-based grouping=====
From the FCA formal concepts, a set of diverse image groups is created by applying a HAC algorithm [9]. Specifically, we propose a single-linkage hierarchical clustering that groups together similar formal concepts, using the zeros-induced index to set the cluster similarity [2].

====2.2 Content-Based Information Retrieval====
This sub-system is concerned with content-based image retrieval, which models the user preferences by using a relevance feedback algorithm. The general methodology involves five steps:

1. Reduction of the data dimensionality: The provided low-level visual features [6] are used to generate a feature vector associated with each image, generically denoted as x, in a space of dimension N = 945. These features are reduced using Principal Component Analysis (PCA). We retain only the first components that account for 80% of the data variability. In this way the original dimension of our characteristic space is reduced to a new characteristic vector of dimension M < N. One of the advantages of this reduction is that the new transformed components are in decreasing order with respect to the variance explained by the corresponding principal component.

2. Selecting the relevant and non-relevant sets: The user looks at a few screens, each showing some images, and marks some of them as being relevant and non-relevant (run5). For the automatic runs (run1 and run3), the relevant and non-relevant images are automatically selected. For the one-topic subset, the relevant images are the images given by Wikipedia for the topic, and for the multi-topic subset, we have generated the relevant images by selecting the first five results. A set of non-relevant images has been manually generated taking into account the non-relevance guidelines given in [6] (photos with regular people as main subject, photos with riots and protests). At each query, the non-relevant images required are randomly selected from the generated non-relevant set.

3. Parameter estimation of the Local Logistic Regression models [8]: The reduced feature vectors (PCA) and the relevant and non-relevant sets are the inputs of several Local Logistic Regression models whose outputs are the probabilities for user assessment, i.e., the probabilities he/she would assign to the fact that the image belongs to the relevant set. The feature vector is split dynamically into m groups of non-fixed size. Each group is used to adjust a model of the highest order that the input sets and PCA components allow.

4. Ranking of the database: Models are evaluated on all the images of the database and return the probabilities of being relevant for each estimated model; as a result, we have a probability vector (p) of dimension m for each individual image. We combine these probabilities into just one by using a weighted average. The weight (w) for a given probability is the amount of variance accounted for by the group of components used to adjust the model. Finally, this procedure gives us a score/probability for each image in the database.

5. Ranking of the database: The final diversity similarity rank is generated by selecting the highest-probability image in each of the clusters generated by the TBIR sub-system (run3). If there are fewer than 50 clusters, a second selection is made, taking the second-highest-probability image of each cluster. For the automatic run using only visual information (run1), the clusters are made by a k-means (k = 50) procedure over the PCA components of the visual feature vector.

===3. RESULTS===
We submitted four runs, computed as follows:

* run1: automated, using visual information only (uses step 2 presented in Section 2.2);
* run2: automated, using text information only (uses step 1 presented in Section 2.1);
* run3: automated multimedia (uses steps 1-2 presented in Sections 2.1 and 2.2);
* run5: everything allowed, textual clusters with FCA and the manual relevance feedback algorithm using visual features (uses steps 1-2 presented in Sections 2.1 and 2.2).

Results are presented in Table 1.

{| class="wikitable"
|+ Table 1: Official Metrics for Retrieving Diverse Social Images Task. Best result for each topic set is in bold, and best result for the automatic runs is in italics. First block results are for the one-topic subset, the second for the multi-topic subset, and the third for the overall set.
! Set !! Metric !! run1 !! run2 !! run3 !! run5
|-
| rowspan="3" | One-topic || P@20 || 0.6362 || ''0.7094'' || 0.7051 || '''0.7645'''
|-
| CR@20 || 0.3704 || ''0.4082'' || 0.3995 || '''0.4194'''
|-
| F1@20 || 0.4618 || ''0.5068'' || 0.4988 || '''0.5240'''
|-
| rowspan="3" | Multi-topic || P@20 || ''0.7300'' || 0.6393 || 0.7207 || '''0.7886'''
|-
| CR@20 || 0.4257 || ''0.4407'' || 0.4116 || '''0.4491'''
|-
| F1@20 || ''0.5130'' || 0.5025 || 0.5001 || '''0.5519'''
|-
| rowspan="3" | Overall || P@20 || 0.6835 || 0.6741 || ''0.7129'' || '''0.7766'''
|-
| CR@20 || 0.3983 || ''0.4246'' || 0.4056 || '''0.4344'''
|-
| F1@20 || 0.4876 || ''0.5046'' || 0.4994 || '''0.5380'''
|}

It is interesting to observe that our best results for both precision and diversification are obtained with the multimedia human-based approach, run5 (overall F1@20 = 0.5380), on both subsets: one-topic, F1@20 = 0.5240, and multi-topic, F1@20 = 0.5519. For the automatic runs, the best result on the one-topic subset is achieved by run2, F1@20 = 0.5068; whereas for the multi-topic subset, run1 gets the highest precision, P@20 = 0.7300, and the best overall performance, F1@20 = 0.5130, but run2 gets better diversification, CR@20 = 0.4407.

===4. CONCLUSIONS===
We presented a multimodal approach for image diversification that applies concept-based modelling (based on FCA and HAC) to cluster the images according to the latent topics addressed by their textual content, joined with a relevance feedback algorithm that uses the visual features to determine the similarity. Results show that the manual version of the multimedia approach works better than the automatic one. We attribute this to the way the relevant and non-relevant images are chosen to estimate the model. A human knows the meaning of the topic better and therefore selects the most significant images for the model. Our challenge is to make the automatic approach able to select the relevant and non-relevant images as a human being would.

This work has been partially supported by the VOXPOPULI (TIN2013-47090-C3-1-P) and DPI2013-47279-C2-1-R Spanish projects.

Copyright is held by the author/owner(s). MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany.

===5. REFERENCES===
[1] R. Agrawal, S. Gollapudi, A. Halverson, and S. Ieong. Diversifying search results. In Proceedings of the Second ACM International Conference on Web Search and Data Mining, pages 5–14. ACM, 2009.

[2] F. Alqadah and R. Bhatnagar. Similarity measures in formal concept analysis. Annals of Mathematics and Artificial Intelligence, 61(3):245–256, 2011.

[3] X. Benavent, A. Garcia-Serrano, R. Granados, J. Benavent, and E. de Ves. Multimedia information retrieval based on late semantic fusion approaches: Experiments on a Wikipedia image collection. IEEE Transactions on Multimedia, PP(99):1–1, 2013.

[4] A. Castellanos, J. Cigarrán, and A. García-Serrano. UNED @ Retrieving Diverse Social Images Task. In MediaEval Multimedia Benchmark Workshop, CEUR-WS.org, 1263, ISSN 1613-0073, 2014.

[5] E. de Ves, G. Ayala, X. Benavent, J. Domingo, and E. Dura. Modeling user preferences in content-based image retrieval: a novel attempt to bridge the semantic gap. Neurocomputing, (0):–, 2015.

[6] B. Ionescu, A. L. Gînscă, B. Boteanu, A. Popescu, M. Lupu, and H. Müller. Retrieving diverse social images at MediaEval 2015: Challenge, dataset and evaluation. In Working Notes Proceedings of the MediaEval 2015 Workshop, 2015.

[7] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.

[8] C. Loader. Local Regression and Likelihood. New York: Springer-Verlag, 1999.

[9] C. D. Manning, P. Raghavan, and H. Schütze. Hierarchical clustering. In Introduction to Information Retrieval, pages 377–403. Cambridge University Press, 2008.

[10] S. Rudinac, A. Hanjalic, and M. Larson. Generating visual summaries of geographic areas using community-contributed images. IEEE Transactions on Multimedia, 15(4):921–932, 2013.

[11] R. Wille. Concept lattices and conceptual knowledge systems. Computers & Mathematics with Applications, 23(6):493–515, 1992.
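
As an illustration of the two ranking steps of Section 2.2 (steps 4 and 5), the following Python sketch combines the m per-model probabilities of each image into one score by a variance-weighted average, and then builds a diversified list by taking the highest-probability image of each cluster, followed by the second-highest of each cluster if the list is still too short. It is a minimal sketch under stated assumptions, not the system used for the submitted runs: the function names, the toy probabilities, weights and cluster labels are all hypothetical, the ordering of the images picked in the same pass (by descending score) is an assumption, and the loop generalizes the "second selection" to further passes when needed.

```python
from collections import defaultdict


def combine_probabilities(p, w):
    """Weighted average of the m per-model probabilities p of one image.

    w[i] stands for the variance accounted for by the group of PCA
    components used to adjust model i (values here are illustrative).
    """
    return sum(pi * wi for pi, wi in zip(p, w)) / sum(w)


def diversity_rank(scores, clusters, size=50):
    """Pick the best image of every cluster, then the second best, etc.

    scores:   image id -> combined relevance probability
    clusters: image id -> cluster label (e.g. from the textual sub-system)
    Assumption: images picked in the same pass are ordered by score.
    """
    by_cluster = defaultdict(list)
    for image, label in clusters.items():
        by_cluster[label].append(image)
    for members in by_cluster.values():
        members.sort(key=lambda img: scores[img], reverse=True)

    ranking, depth = [], 0
    while len(ranking) < size:
        # One pass: the depth-th best image of each cluster that has one.
        layer = [m[depth] for m in by_cluster.values() if len(m) > depth]
        if not layer:
            break  # every cluster is exhausted
        layer.sort(key=lambda img: scores[img], reverse=True)
        ranking.extend(layer)
        depth += 1
    return ranking[:size]


# Toy data (hypothetical): three models, four images, two clusters.
weights = [0.5, 0.3, 0.2]
per_model = {
    "a": [0.9, 0.8, 0.7],
    "b": [0.6, 0.9, 0.5],
    "c": [0.2, 0.4, 0.9],
    "d": [0.8, 0.1, 0.3],
}
clusters = {"a": 0, "b": 0, "c": 1, "d": 1}

scores = {img: combine_probabilities(p, weights) for img, p in per_model.items()}
top3 = diversity_rank(scores, clusters, size=3)
print(top3)
```

In this toy case image b scores higher than image d, but d is placed before b because it is the best image of a cluster that is not yet represented, which is the diversification effect the step is meant to achieve.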