Clustering and Ranking of Image Search Results

Elena Sivogolovko *
University of Saint-Petersburg
efecca@gmail.com

* This work is partially supported by the Russian Foundation for Basic Research under grant 07-07-00268.

Proceedings of the Spring Young Researcher's Colloquium on Database and Information Systems, Moscow, Russia, 2009

Abstract

In a typical content-based image retrieval (CBIR) system, the query result is a set of images sorted by feature similarity with respect to the query. We introduce a new approach to CBIR result representation. We propose that a CBIR system should retrieve image clusters whose elements are sorted by the most meaningful feature similarities. This paper does not present a complete approach but rather work in progress.

1 Introduction

In image search, good organization of the search results is as important as the search accuracy. Many existing CBIR search engines return a large quantity of search results ranked by their relevance to the given query. Users have to go through the list and look for the desired ones. This is a time-consuming task, since the returned results usually cover multiple topics and these topics are mixed together. Things become even worse when one topic dominates but is not what the user desires. A possible solution to this problem is to cluster search results into different semantic groups. In traditional CBIR, image clustering techniques are often used to design a convenient user interface that provides a more meaningful representation of search results. However, as images are usually represented by low-level visual features, it is hard to obtain a good clustering from a semantic perspective, because images with high feature similarity to each other may be very different in terms of semantics. This is known as the semantic gap problem.

In this paper, we consider the problem of clustering and ranking image search results. We propose a framework that represents an image query result as a clustering structure in which the elements of each cluster are sorted by the most meaningful features.

2 Related work

In this section we review graph partitioning and spectral clustering methods, which are the foundation of our framework. Such approaches are successfully used in other CBIR systems, for example F-I-T [6] and ImageRank [7], but these systems do not consider information about the meaning of image features, which can be obtained during the clustering process. We suppose, however, that this kind of information can be useful for search result representation.

2.1 Bipartite graph model

The graph model is widely used in different clustering tasks, such as sparse matrix partitioning, circuit partitioning, image segmentation and document clustering. It can also be used in image clustering. Qiu [4] used an undirected bipartite graph (Figure 1) to represent the relationship between images and their low-level features. The graph is constructed as follows. Assume each image in the database is represented by an m-bin colour histogram, and let H_k = (h_k(c_1), h_k(c_2), ..., h_k(c_m)) denote the histogram of the k-th image, where k = 1, 2, ..., n and the value of h_k(c_i) indicates the association of colour c_i with the k-th image. Then the bipartite graph can be represented by a triplet G = (I, C, W), where I = {i_1, i_2, ..., i_n} is the set of images, C = {c_1, c_2, ..., c_m} is the set of colours and W is the set of edges connecting vertices from different vertex sets, i.e. W = {<i, j> | i in I, j in C}. The weight of an edge, W_ij = h_i(c_j), is the j-th colour bin count of the i-th image.

Figure 1: Images-and-features bipartite graph. Squares and circles represent low-level features (colours in our case) and images, respectively.

Note that we can consider this graph as a similarity graph with similarity function S = W_ij. The adjacency matrix of G can be written as

M = \begin{pmatrix} 0 & W \\ W^T & 0 \end{pmatrix}    (1)

where the first block of rows and columns is indexed by I and the second by C. In this context, co-clustering of images and features can be translated into a graph partitioning problem.
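To make this construction concrete, the following is a minimal NumPy sketch of how the adjacency matrix of equation (1) could be assembled from the colour histograms; the function name build_adjacency and the toy data are illustrative assumptions on our part, not part of [4].

```python
import numpy as np

def build_adjacency(histograms):
    """Assemble the (n+m) x (n+m) bipartite adjacency matrix M of equation (1)
    from an n x m array of colour histograms (one m-bin histogram per image)."""
    W = np.asarray(histograms, dtype=float)   # W[i, j] = h_i(c_j)
    n, m = W.shape
    M = np.zeros((n + m, n + m))
    M[:n, n:] = W       # edges from image vertices to colour vertices
    M[n:, :n] = W.T     # symmetric part: the graph is undirected
    return M

# Toy example: 6 images and 4 colour bins, as in Figure 1.
H = np.random.rand(6, 4)
H = H / H.sum(axis=1, keepdims=True)   # normalise each histogram
M = build_adjacency(H)
```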
2.2 Graph partitioning and spectral co-clustering

The idea of clustering is to separate a set of points into different groups according to their similarities. For data given in the form of a similarity graph, this problem can be restated as follows: we want to find a partition of the graph such that the edges between different groups have very low weight (which means that points in different clusters are dissimilar from each other) and the edges within a group have high weight (which means that points within the same cluster are similar to each other). Given a similarity graph G = <V, E> (in our case V = I ∪ C and E = W) with adjacency matrix M, the simplest and most direct way to construct a partition is to solve the mincut problem. This consists of choosing the partition {A_1, ..., A_k} which minimizes

cut(A_1, ..., A_k) = \sum_{i=1}^{k} cut(A_i, \bar{A}_i)    (2)

where

cut(A, \bar{A}) = \sum_{i \in A, j \in \bar{A}} m_{ij},    (3)

m_ij is the element of the adjacency matrix M corresponding to the edge from vertex i to vertex j, and \bar{A} is the complement of a subset A ⊂ V. The problem is that in many cases the solution of mincut simply consists in separating one individual vertex from the rest of the graph. Of course, this is not what we want to achieve in clustering, as clusters should be reasonably large groups of points. One way to circumvent this problem is to explicitly request that the sets A_1, ..., A_k are "reasonably large". The two most common objective functions which encode this are RatioCut and the normalized cut (Ncut). In RatioCut, the size of a subset A of the graph is measured by its number of vertices |A|, while in Ncut the size is measured by the weight of its edges vol(A):

RatioCut(A_1, ..., A_k) = \sum_{i=1}^{k} \frac{cut(A_i, \bar{A}_i)}{|A_i|}    (4)

Ncut(A_1, ..., A_k) = \sum_{i=1}^{k} \frac{cut(A_i, \bar{A}_i)}{vol(A_i)}    (5)

In this work we use Ncut as the objective function because it addresses both clustering goals mentioned above: Ncut minimizes the between-cluster similarity and maximizes the within-cluster similarity, unlike RatioCut, which addresses only the first of them. Ncut can be rewritten as a trace minimization problem, which leads to retrieving the first k generalized eigenvectors of

L v = \lambda D v    (6)

where D is the diagonal degree matrix with D_ii = \sum_{k=1}^{n} m_ik and L = D − M is the unnormalized graph Laplacian. After that, the desired image clusters can be obtained by running a routine clustering algorithm such as k-means on these eigenvectors.
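This spectral relaxation could be implemented, for example, with SciPy and scikit-learn as sketched below; the function name spectral_coclusters, the dense generalized eigensolver and the use of scikit-learn's k-means are our assumptions, not the author's implementation.

```python
import numpy as np
from scipy.linalg import eigh          # generalized symmetric eigensolver
from sklearn.cluster import KMeans

def spectral_coclusters(M, k):
    """Partition the vertices of the bipartite graph with adjacency matrix M
    into k 'mixed' clusters via the Ncut relaxation (equation (6))."""
    D = np.diag(M.sum(axis=1))         # degree matrix, D_ii = sum_k m_ik
    L = D - M                          # unnormalized graph Laplacian
    # Solve L v = lambda D v; eigh returns eigenvalues in ascending order,
    # so the first k columns are the k smallest generalized eigenvectors.
    # (Assumes every vertex has non-zero degree, so D is positive definite.)
    _, eigvecs = eigh(L, D)
    V = eigvecs[:, :k]                 # row y_i is the embedding of vertex i
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)
```

With M built as in the previous sketch, the first n labels describe the image vertices and the remaining m labels the colour vertices, which yields the mixed clusters used in the next section.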
3 Co-clustering and ranking framework

Suppose the dashed line in Figure 1 shows the partition that minimizes the two-part cut. After applying the clustering algorithm, we obtain two subsets {i_1, i_2, i_3, i_4, c_1, c_2} and {i_5, i_6, c_3, c_4}. We call such clusters mixed. Consequently, the low-level features are divided into two feature clusters {c_1, c_2} and {c_3, c_4}, while the images are divided into two image clusters {i_1, i_2, i_3, i_4} and {i_5, i_6}, respectively. Most CBIR systems return the obtained image subsets and do not use the feature subsets at all. We propose to sort the images in each image cluster using the low-level features from the corresponding feature cluster. We suggest that this kind of ranking helps to improve image retrieval results.

To sort images by more than one feature we need a data fusion method. In general, data fusion is the use of techniques that combine data from multiple sources in order to achieve inferences that are more efficient and potentially more accurate than those obtained from a single source. In our case, a source is a list of images ranked by one of the selected low-level features. Many commonly used data fusion algorithms, such as CombSUM and CombMNZ, are successfully used in text retrieval. CombMNZ is considered to outperform other data fusion algorithms. It works as follows: an element of the resulting ranked list gets a rank equal to the sum of all its ranks in the fused lists, multiplied by the number of lists in which this element occurs with a non-zero rank:

rank_{result}(i) = \sum_{fused\ lists} rank_{list}(i) \cdot nz    (7)

where nz is the number of lists in which the element has a non-zero rank. To summarize, our algorithm for co-clustering and ranking images and features is listed below.

Algorithm 1 Co-clustering and ranking algorithm
Require: Image set I and the vectors H representing the histograms of these images
1: Form a bipartite graph G = <I, C, W>.
2: Compute the matrix M, the degree matrix D with D_ii = \sum_{k=1}^{n} m_ik, and L = D − M.
3: Compute the first k eigenvectors v_1, ..., v_k of the generalized eigenproblem L v = \lambda D v.
4: Let V ∈ R^{n×k} be the matrix containing the vectors v_1, ..., v_k as columns.
5: For i = 1, ..., n let y_i ∈ R^k be the vector corresponding to the i-th row of V.
6: Cluster the points (y_i)_{i=1,...,n} in R^k with the k-means algorithm into mixed clusters C_1, ..., C_k.
7: Extract image clusters from the mixed ones. Rank the images in every cluster using the features from the corresponding feature clusters and the CombMNZ data fusion method.
8: Return the set of image clusters I_1, ..., I_k.

It is important to note that feature values should be normalized before the algorithm is applied to the graph. In fact, the most widely used image features, such as the colour histogram, CPAM histogram and Spatial CPAM, already provide a normalization mechanism.
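As a sketch of how formula (7) and step 7 of Algorithm 1 might be realized, the routine below fuses several ranked lists of image identifiers with CombMNZ; turning list positions into descending rank scores is an assumption of ours, since formula (7) does not fix the rank scale, and the name combmnz is hypothetical.

```python
def combmnz(ranked_lists):
    """Fuse ranked lists of image ids with CombMNZ: sum of per-list rank
    scores multiplied by the number of lists containing the image."""
    scores, counts = {}, {}
    for lst in ranked_lists:
        for position, image_id in enumerate(lst):
            # Better-placed images get larger rank scores.
            rank_score = len(lst) - position
            scores[image_id] = scores.get(image_id, 0) + rank_score
            counts[image_id] = counts.get(image_id, 0) + 1
    fused = {img: scores[img] * counts[img] for img in scores}
    return sorted(fused, key=fused.get, reverse=True)   # best first

# Example: two per-feature rankings of the same image cluster.
print(combmnz([["i1", "i3", "i2"], ["i3", "i1", "i4"]]))
```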
4 Future Work

First, we will test our framework on real image queries. We plan to use two different databases from the CorelPhotoSet collection: the ImageDBCorel database (650 images in 9 semantic classes) and the CorelSmall-100 database (100 images in 16 semantic classes). Second, if our co-clustering and ranking scheme gives good results, we will pursue the following research directions.

As in many other clustering systems, the most difficult question in our co-clustering scheme is "How do we define the number of clusters?". In our case, clustering algorithms such as G-means and X-means, which determine the number of clusters themselves, are not applicable, because we need this number for the eigenvector search before the clustering algorithm starts. There is some research in spectral clustering where the authors propose to use the eigengap to determine a correct number of clusters, so this method can be tested in our framework too.

Concerning the clustering algorithm itself, k-means is an easy-to-implement classical method which can provide a good clustering scheme. In our case its main disadvantage is the assumption that image clusters are crisp, but it is easy to see that in most cases image clusters overlap. Therefore we suppose that fuzzy clustering methods (for example, fuzzy k-means) can provide more relevant results in image clustering than crisp ones, and we will check this hypothesis.

Our framework should also be extended to use multiple feature sets, for example colour histograms and texture features simultaneously, because colour data alone is insufficient for retrieval in collections of heterogeneous images. In this case we face two important problems. The first is the definition of the set of features which should be used for clustering. A wide variety of colour, texture and shape features have been proposed to represent image content, so selecting the most meaningful ones is genuinely difficult. The second is a normalization problem: if we combine all image feature values into one image representation vector, this vector should be normalized, and a suitable normalization method is needed.

5 Conclusion

This research is conducted under the supervision of Boris Novikov. The contribution of this paper is a model of a new co-clustering and ranking scheme. This scheme combines different information retrieval techniques, such as spectral clustering, ranking and data fusion methods, in order to improve image retrieval effectiveness. We expect that our search result representation will be convenient and helpful. We plan to include the developed framework in the CBIR system Foto Finder, which is being created in our research group.

References

[1] Reginald E. Hammah and John H. Curran: A Tutorial on Spectral Clustering. Technical Report No. TR-149, August 2006.
[2] Ilya Markov, Natalia Vassilieva: Building Up Low-Level Centroids for Groups of Perceptually Similar Images.
[3] E.A. Fox, J.A. Shaw: Combination of Multiple Searches. In TREC-2 (1994), 243-249.
[4] Guoping Qiu: Bipartite Graph Partitioning and Content-based Image Clustering.
[5] Manjeet Rege, Ming Dong, Jing Hua: Clustering Web Images with Multi-modal Features. SIGMM'07, September 23-28, 2007.
[6] Bin Gao, Tie-Yan Liu, Tao Qin, Xin Zheng, Qian-Sheng Cheng, Wei-Ying Ma: Web Image Clustering by Consistent Utilization of Visual Features and Surrounding Texts. SIGMM'05, November 6-11, 2005.
[7] Xiaofei He, Wei-Ying Ma, Hongjiang Zhang: ImageRank: Spectral Techniques for Structural Analysis of Image Database.