
Computing the Similarity Between Two Images Using DCT Coefficients

Visual image retrieval on compressed domain with Q-distance

Hong Heather Yu
Panasonic Information and Networking Technology Lab.
heathery@

Abstract

This paper proposes a new image retrieval scheme that works directly on compressed image (JPEG) databases. A large percentage of image databases are stored in compressed formats such as JPEG, and about half of the images on the Internet are also in JPEG format. Image retrieval systems that require JPEG decompression therefore greatly limit the speed of image searching, and new methods for retrieving images without JPEG decoding are needed for web image search and compressed image database retrieval. In this paper, we propose a new metric, Q-distance, that can be used to measure the distance between two compressed images. A system that uses Q-distance for fast image retrieval is also presented. Experimental results show that Q-distance is robust against variation and that this new retrieval scheme, which works directly on the compressed image domain, is fast to execute and suitable for web image searching and retrieval.

1. Introduction

1.1 Motivation

A study by Euro-marketing shows that over 157 million people worldwide have access to the Internet, a gigantic multimedia information database. Needless to say, one of the most important functions of the Internet is 'search'. The overwhelming amount of multimedia available on the high-traffic Internet demands fast searching and browsing of text, audio, and visual data. Since most images on the Internet are in compressed formats, it is important to develop techniques that allow visual image searching without image decompression, that is, searching directly on the compressed image domain.

A compression format such as JPEG compresses an image while preserving its visual quality by discarding the small high-frequency coefficients. This means that when the least significant coefficients are thrown away, the visual appearance
of an image does not change significantly, i.e., the overall structure of an image is kept in the significant coefficients. Is this characteristic useful in designing similarity-based image retrieval systems? Can this property be employed to design a compressed-domain image search engine? In this paper, we present such an image search engine and show that this characteristic of visual media is indeed helpful in designing compressed-domain image retrieval software.

Why? With regard to image retrieval, many real-world scenarios emphasize the similarity of the overall structure of images. For instance, in web image searching, users may have only a rough idea of the image they are looking for. Ideally, a simple sketch of the overall structure of the image can help them find the image in the database. This requires a good distance measurement between the query sketch and the images in the database.

In this paper, we propose a new metric, Q-metric, for the compressed domain. It is defined based on the analysis of the aforementioned visual characteristics of images and measures how many SFCs (Significant Frequency Components) two images have in common. As a result, it gives a good measurement of the overall similarity between images. By measuring the distance directly on the compressed domain, it significantly enhances image query speed. Consequently, it gives higher usability for compressed image database retrieval, such as web image searching.

1.2 Related works

Research on visual content-based image retrieval [1,2,3,4,5,6,7,8,9,10] started several years ago. One of the application areas is web image search engines. Yahoo image surfer by Excalibur [11], the MIQ by the University of Washington [12], VisualSEEk by Columbia University [13], and others have made great progress in web image searching. In particular, the MIQ [12] system by Jacobs, Finkelstein, and Salesin introduced a new metric for querying images that essentially compares how many significant wavelet coefficients the query has in common
with potential targets. Their experimental results showed dramatic improvement in both speed and success rate over the conventional L1, L2, or color histogram norms. However, one drawback of this system is that it works on raw images instead of compressed images, while most of the images on the World Wide Web are in either JPEG or GIF format. This greatly degrades the efficiency of the system. In this paper, we propose a new metric that works directly on compressed images (JPEG format), which, from the application point of view, can significantly improve the performance of web image searching and compressed image database retrieval.

In the next section, the definition of Q-distance is given, along with a description of how Q-distance is used for image retrieval. Thereafter, we outline the system architecture for compressed-domain image retrieval with Q-distance. Experimental results are presented in the last section, followed by concluding remarks.

2. Q-metric

Let $I_1, I_2, \ldots, I_N$ represent the images in the database and $Q$ represent the query image. Assume the image size is $X \times Y$. Denote by $I_n^{00}(i,j)$ the DC coefficient of the $(i,j)$th block and by $I_n^{lk}(i,j)$ the coefficient of the $(l,k)$th channel of the $(i,j)$th block of image $I_n$. Here, $l \in [1,8]$, $k \in [1,8]$, $i \in [1,I]$, and $j \in [1,J]$. Notice that $I \times 8 = X$ and $J \times 8 = Y$. The DC coefficients of all blocks form a new image, the DC-image $I_n^{DC}$ of the original image $I_n$. To define the Q-metric, a wavelet transformation is performed on the DC-images of the query and the target images. Let us denote the wavelet coefficients of the DC-image as $I_n^{DC*}(i,j)$. The Q-metric, which measures the similarity between the query image and the target image, is thus defined as

\[
|Q, I_m| = \sum_{i,j} \omega_{00}\, \delta\big(Q^{DC*}(i,j),\, I_m^{DC*}(i,j)\big) + \sum_{l,k} \sum_{i,j} \omega_{lk}\, \delta\big(Q^{lk}(i,j),\, I_m^{lk}(i,j)\big)
\]

where $\omega_{00}$ and $\omega_{lk}$ are weighting functions, and the single-channel distance function $\delta$ is defined as follows: $\delta(Q, I) = 1$ when both arguments exceed the threshold, i.e., $Q^{DC*}(i,j) > T^*$ and $I^{DC*}(i,j) > T^*$ for the wavelet coefficients of the DC-image, and $Q^{lk}(i,j) > T$ and $I^{lk}(i,j) > T$ for the DCT coefficients, with thresholds $T^*$ and $T$; $\delta(Q, I) = 0$ otherwise.

Here, we refer to the distance between two images $Q$ and $I$ that is computed using the Q-metric as the Q-distance:

\[
\mathcal{Q}(Q, I) = |Q, Q| - |Q, I|
\]

A fast image retrieval system that works directly on compressed images is presented in the next section. In this system, Q-distance is employed to measure the visual similarity of two images and is therefore used to retrieve images similar to the query image. During the retrieval phase, image $I_M$ is returned as the best match for the query image $Q$ if

\[
\mathcal{Q}(Q, I_M) \le \mathcal{Q}(Q, I_m) \quad \text{for } \forall m \in [1, N],
\]

i.e., if

\[
|Q, I_M| \ge |Q, I_m| \quad \text{for } \forall m \in [1, N].
\]

3. The system

The query system uses the metric defined above for similarity-based image retrieval. As mentioned in the first section, a compression format such as JPEG compresses an image while keeping its visual quality by discarding the small high-frequency coefficients. This is exactly the characteristic we exploit to design a compressed-domain image search engine. By recognizing the important coefficients of an image, the Q-metric is able to capture the distance between the overall structures of two images. In return, it gives a good measure of the similarity between two images. The metric uses both the wavelet coefficients of the DC-image and the AC coefficients of the DCT transformation. Since the DC coefficient as well as the AC coefficients of an image can be obtained directly from the JPEG file without full decoding, the performance of a system that uses the Q-metric is greatly enhanced.

3.1 Q-distance for image similarity retrieval

Figure 1. Image retrieval system layout

Figure 1 outlines an image retrieval system. The database consists of JPEG compressed images only. In this system, a 2-D standard Haar wavelet decomposition is first performed on the DC-image (see Section 2 for the definition of the DC-image). Next, the Q-distance between the query image $Q$ and each of the potential target images in the database, $I_1, I_2, \ldots, I_N$, is computed. Finally, a list of images is returned to the user
based on a winner-first strategy, i.e., $I_M$ is returned if the Q-distance of $I_M$ to $Q$ is among the $K$ smallest Q-distances of all $N$ images, i.e.,

\[
\mathcal{Q}(Q, I_M) \le \mathcal{Q}(Q, I_m),
\]

which is equivalent to

\[
|Q, I_M| \ge |Q, I_m| \quad \text{for } \forall m \in [1, N] \text{ and } m \notin \mathcal{M},
\]

where $\mathcal{M}$ represents the returned image set.

3.2 Web image search engine

In web image searching, two important factors need to be considered: speed and interface. The interface problem is beyond the scope of this paper. However, searching directly on compressed images will no doubt boost performance.

4. Results and summary

4.1 Experiment result

Our first set of experiments is a comparison study between the Q-distance and visual similarity. In Figure 2(c), $|Q, I_m|$ between a query image $Q$ and 32 other images in the database is plotted. Figure 2(b) shows several sample images: I ($I_5$), II ($I_9$), IV ($I_{11}$), and V ($I_{15}$) have a large Q-distance (small $|Q, I_m|$), while III ($I_{10}$) and VI ($I_{21}$) have small Q-distances to $Q = I_{31}$.

Figure 3 shows a sample retrieval result. The query image shown in (a) is a sketch painted by the user. The 9 images in (b) are the first nine images on the returned list. (c) and (d) give sample Q-distance plots: (c) plots the Q-distances between the sample query image Q1 shown in (a) and the first 33 images in the database, whereas (d) shows the ordered Q-distances used for retrieval.

Experimental results show that the retrieval system that uses the Q-distance to measure the similarity between two images outperforms those using the L1 or L2 distance. In addition, this system goes one step further: it performs searching and retrieval on the compressed images, which is fast to execute and suitable for web image searching.

4.2 Future work

We are currently testing this system on a large image database. Meanwhile, the same methodology can be extended to similarity-based video clip retrieval. A video retrieval system that works directly on MPEG video is also under testing.

Figure 2. A comparison study: plot of $|Q, I_m|$
(a) Query image Q
(b) Images I to VI: $I_5$ ($|Q, I_m| = 29$), $I_9$ ($|Q, I_m| = 24$), $I_{10}$ ($|Q, I_m| = 150$), $I_{11}$ ($|Q, I_m| = 30$), $I_{15}$ ($|Q, I_m| = 11$), $I_{21}$ ($|Q, I_m| = 120$), in comparison with $I_{31}$ in (a)
(c) Plot of $|Q, I_m|$ with the first 33 images in the database and $Q = I_{31}$

Figure 3. A query sample result
(a) Query image Q1, a rough sketch
(b) Returned first 9 images
(c) Plot of Q-distances between the sample query Q1 shown in (a) and the first 33 images in the database
(d) Plot of ordered Q-distances of the sample query in (a)
Plot annotation: the matching point with minimum Q-distance

*Note: In the above examples, the size of each image is 640x480. For illustration purposes, the images shown are several times smaller than their actual sizes, and only the first 33 images in the retrieval database are plotted.

References

[1]. H. J. Zhang, C. Low, S. Smoliar, "Automatic parsing of news video", in Proceedings, IEEE ICMCS'94, 1994, pp. 45-54
[2]. J. Dowe, "Content-based retrieval in multimedia imaging", in Proceedings, SPIE, Visual Communication and Image Processing, 1993
[3]. M. Flickner, et al., "Query by image and video content: the QBIC system", IEEE Computer, 1995
[4]. J. Smith, S.-F. Chang, "VisualSEEk: a fully automated content-based image query system", in Proceedings, ACM Multimedia'96, 1996, pp. 87-96
[5]. T. S. Huang, S. Mehrotra, K. Ramchandram, "Multimedia analysis and retrieval system (MARS) project", in Proceedings, Clinic on Library Application of Data Processing, 1996
[6]. J. R. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R. Humphrey, R. Jain, C. F. Shu, "The virage image search engine: an open framework for image management", in Proceedings, SPIE, 1996
[7]. Cascia, E. Ardizzone, "JACOB: Just a content-based query system for video databases", ICASSP'96, 1996
[8]. T. P. Minka, R. W. Picard, "Interactive learning using a 'Society of Models'", Pattern Recognition, V30, N4, 1997
[9]. H. Yu, W. Wolf, "A visual search system for video and image databases", in Proceedings, IEEE ICMCS'97, 1997
[10]. J. Krey, et al., "Video Retrieval by still image analysis with ImageMiner", in Proceedings, SPIE'97, 1997
[11]. /
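
Illustrative sketch. The Q-metric and Q-distance of Section 2 and the winner-first retrieval of Section 3.1 could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions, not the system described in the paper. It assumes the quantized DCT coefficients have already been read from the JPEG stream (without the inverse DCT) into a NumPy array of shape (I, J, 8, 8), with the DC coefficient at index [0, 0] of each block, and that I and J are powers of two so a full Haar decomposition is possible. The function names, the unit weights, and the threshold values are all hypothetical placeholders; the paper does not specify them. The DC channel is given zero weight in the second sum on the assumption that only AC coefficients are meant there.

```python
import numpy as np

def haar1d(v):
    """Full 1-D Haar decomposition (orthonormal); length must be a power of two."""
    v = np.asarray(v, dtype=float).copy()
    n = v.shape[0]
    while n > 1:
        half = n // 2
        avg = (v[0:n:2] + v[1:n:2]) / np.sqrt(2.0)
        diff = (v[0:n:2] - v[1:n:2]) / np.sqrt(2.0)
        v[:half] = avg
        v[half:n] = diff
        n = half
    return v

def haar2d_standard(a):
    """Standard 2-D Haar decomposition: transform every row, then every column."""
    rows = np.apply_along_axis(haar1d, 1, a)
    return np.apply_along_axis(haar1d, 0, rows)

def default_ac_weights():
    """Assumed weights: 1 for every AC channel, 0 for the DC channel."""
    w = np.ones((8, 8))
    w[0, 0] = 0.0
    return w

def q_metric(q_dct, i_dct, w_dc=1.0, w_ac=None, t_dc=10.0, t_ac=10.0):
    """|Q, I|: weighted count of coefficients that exceed the threshold in both images."""
    if w_ac is None:
        w_ac = default_ac_weights()
    # First term: wavelet coefficients of the two DC-images, threshold T*.
    q_dc = haar2d_standard(q_dct[:, :, 0, 0])
    i_dc = haar2d_standard(i_dct[:, :, 0, 0])
    dc_term = w_dc * np.sum((q_dc > t_dc) & (i_dc > t_dc))
    # Second term: DCT channels (l, k), summed over all blocks, threshold T.
    both = (q_dct > t_ac) & (i_dct > t_ac)
    ac_term = np.sum(w_ac * both.sum(axis=(0, 1)))
    return float(dc_term + ac_term)

def q_distance(q_dct, i_dct, **kw):
    """Q-distance: |Q, Q| - |Q, I|; zero when I shares every significant coefficient of Q."""
    return q_metric(q_dct, q_dct, **kw) - q_metric(q_dct, i_dct, **kw)

def retrieve(q_dct, database, k, **kw):
    """Indices of the k database images with the smallest Q-distance to the query."""
    d = [q_distance(q_dct, img, **kw) for img in database]
    return sorted(range(len(d)), key=d.__getitem__)[:k]
```

With non-negative weights, every coefficient pair counted in |Q, I| is also counted in |Q, Q|, so the Q-distance in this sketch is never negative, and ranking by smallest Q-distance is the same as ranking by largest |Q, I|, as stated in Section 2.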
