Adaptive Retrieval in a P2P Cloud Datacenter - Multimedia Database Retrieval: Technology and Applications

Database Reference

In-Depth Information

8.3.3

Query Processing on the P2P Network

8.3.3.1

Query Indexing

In the query processing stage (cf. Fig. 8.2 b), the query feature vector is mapped

into the SOTM, similarly as in the indexing stage. However, the multiple-cluster

mapping method is employed here to improve retrieval accuracy. Figure 8.4

illustrates a simplified example of dividing data samples (represented in dots) into

six clusters, where the cluster centers are denoted by crossed-circles. If the query

image (represented as a triangle) is located close to the boundary between multiple

clusters, retrieval precision drops if the search is done for only the closest cluster. As

shown in Fig. 8.4 , relevant images may reside under multiple clusters (such as the

samples covered by the circle surrounding the query). Therefore, if the query image

is located at equal or nearly equal distance to multiple cluster centers, multiple

clusters should be chosen to avoid a significant degradation in the retrieval precision.

When a node issues a query, it converts the query image to a set of cluster IDs,

and then sends the query message to the nodes responsible for the cluster IDs. In

order to obtain the cluster IDs for the query, a multiple-cluster mapping denoted

by:

Z k

ₒ

is employed. This is done by vector quantization of the query feature

vector, f q :

(

f q )= {

clusID 1 ,

clusID 2 ,...,

clusID k }

(8.5)

where clusID 1 is the closest (winning) cluster to the query, i.e.,

arg mi i f q −

w i

clusID 1 ≡

(8.6)

and clusID 2 and clusID k are respectively the cluster ID of the second and the k -th

neighbor of the winning cluster.

Afterwards,

the

node

sends

the

query

messages

the

form

key i ,

IP address

, where i

,...,

k , and key i is obtained by:

key i =

SHA1

(

clusID i ) ,

,...,

(8.7)

Using the query routing scheme in the Chord, each query message is only forwarded

to the node responsible for the keys. The number of query messages depends on the

parameter k . Increasing k also expands the number of clusters that are close to the

query, and thus, increase the probabilities of finding the relevant images.

Figure 8.5 shows the precision results as a function of k , the number of clusters

selected for an indexing query in Eq. ( 8.5 ). The database was the Corel database

[ 224 ], containing 40,000 images. Each image was characterized by color histogram,

color moment, Gabor wavelet for texture, and Fourier descriptor, as discussed in

Chap. 2 (Table 2.7 ). The application of SOTM resulted in 233 clusters (the number

of clusters was automatically generated) [ 359 ]. The figure shows the average

Multimedia Database Retrieval: Technology and Applications

Search WWH ::

Custom Search

Home