Scalable Indexing of HD Video - High-Quality Visual Experience

The similarity measure for this descriptor is a linear combination of metrics for histogram comparison (e.g. the Bhattacharyya coefficient):

$$\rho_{sb}(O_{t,i}, O_{t,j}) = \sum_{x} \sqrt{h_{sb}(O_{t,i})(x)\, h_{sb}(O_{t,j})(x)} \qquad (16)$$

Here sb stands for LL or HF and x is the bin index. Finally, the similarity measure is expressed as

$$\rho(O_{t,i}, O_{t,j}) = \alpha \rho_{LL} + (1 - \alpha) \rho_{HF} \qquad (17)$$
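The two formulas above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary layout with keys "LL" and "HF", the default value of alpha, and the normalisation of each histogram to unit sum are all assumptions made for the example.

```python
import math


def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient of two histograms defined over the same bins.

    Assumption: each histogram is normalised to unit sum before comparison.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0  # an empty histogram overlaps with nothing
    return sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))


def similarity(desc_i, desc_j, alpha=0.5):
    """Weighted combination alpha * rho_LL + (1 - alpha) * rho_HF.

    Each descriptor is a hypothetical dict holding one LL and one HF histogram.
    """
    rho_ll = bhattacharyya(desc_i["LL"], desc_j["LL"])
    rho_hf = bhattacharyya(desc_i["HF"], desc_j["HF"])
    return alpha * rho_ll + (1.0 - alpha) * rho_hf
```

With unit-sum histograms the coefficient lies between 0 (no common bins) and 1 (identical distributions), so the combined measure stays in the same range for any alpha between 0 and 1.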

2.3 On Scalable Content-Based Queries

A vast literature is devoted to image retrieval in large databases, and much work on video retrieval is being done. In our previous work we specifically focused on retrieval of objects in video content [14]. In this chapter, two questions in the object-based framework are addressed: query by clip and scalable queries.

The retrieval scenario considered consists of searching an HD video database for a clip containing a query object. This scenario can, for instance, be used for detection of fraudulent post-production, where an object is extracted from a video clip frame by frame and inserted into the background extracted from another sequence.

Let us consider a clip $C_{DB}$ in a video database (DB). A set of object masks $O_{DB} = \{ O_{t,i},\ t = t_0, t_0 + \Delta t, \ldots \}$ is extracted for each object at each level of the wavelet pyramid. The histogram features $H_{DB}$ are computed and stored as metadata.

Let us then consider a query clip $C_Q$ and histogram features $H_Q$ of objects extracted from this clip. The user is invited to select an image $I_{t^*} \in C_Q$ in which the object extraction result is visually the most satisfactory.

We consider both mono-level and cross-level search. In a mono-level search, the descriptor $H_Q$ at a given pyramid level k is compared to all the descriptors available in the DB at the same level. We call this a “mono-level” query. A clip from the DB is a response to the query clip at a given resolution if at least one of its frames is a “good” response to the query. The “goodness” of a response is measured by comparing its similarity with a given threshold. This scenario is well suited to the scalable scenario in which the query is not transmitted at full resolution.
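The mono-level decision rule described above (a clip is returned if at least one of its frames passes the threshold) can be sketched as follows. This is an illustration under stated assumptions, not the chapter's code: the database layout `{clip_id: {level: [descriptor, ...]}}`, the function names, and the default values of alpha and the threshold are all hypothetical, and histograms are normalised to unit sum inside the helper.

```python
import math


def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient of two histograms, normalised to unit sum."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0
    return sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))


def similarity(d1, d2, alpha):
    """alpha * rho_LL + (1 - alpha) * rho_HF for two {"LL", "HF"} descriptors."""
    return (alpha * bhattacharyya(d1["LL"], d2["LL"])
            + (1.0 - alpha) * bhattacharyya(d1["HF"], d2["HF"]))


def mono_level_query(query_desc, db, level, alpha=0.5, threshold=0.8):
    """Return the ids of DB clips that answer the query at one pyramid level.

    db is a hypothetical mapping {clip_id: {level: [frame descriptors]}}.
    A clip is a response if at least one of its frame descriptors at `level`
    reaches the similarity threshold.
    """
    hits = []
    for clip_id, levels in db.items():
        frames = levels.get(level, [])  # clips without this level never match
        if any(similarity(query_desc, d, alpha) >= threshold for d in frames):
            hits.append(clip_id)
    return hits
```

Because `any` stops at the first frame that passes the threshold, a clip is accepted without scanning its remaining frames.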

The “cross-level” search consists of comparing a query descriptor extracted at a chosen resolution level k with descriptors in the DB extracted at a specified resolution level. First of all, this type of query is interesting for “light” processing on the client side: the query object can be extracted at low resolution levels of the wavelet pyramid, while the high-resolution descriptors in the DB are used for retrieval on the server side. Conversely, if high-resolution descriptors are available for the original clip (e.g. stored in the video archive), they can be compared with a low-resolution collection of videos when searching for a fraudulent low-quality video.
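A cross-level query differs from a mono-level one only in that the query level and the DB level are chosen independently. The sketch below illustrates this under assumptions that the text does not state: histograms at all pyramid levels share the same bins, each histogram is normalised to unit sum so that the different pixel counts at different resolutions do not affect the comparison, and the data layout, function names and default parameters are hypothetical.

```python
import math


def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient of two histograms, normalised to unit sum."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0
    return sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))


def similarity(d1, d2, alpha):
    """alpha * rho_LL + (1 - alpha) * rho_HF for two {"LL", "HF"} descriptors."""
    return (alpha * bhattacharyya(d1["LL"], d2["LL"])
            + (1.0 - alpha) * bhattacharyya(d1["HF"], d2["HF"]))


def cross_level_query(query_by_level, query_level, db, db_level,
                      alpha=0.5, threshold=0.8):
    """Compare the query descriptor at `query_level` with DB descriptors
    stored at `db_level`.

    query_by_level: hypothetical mapping {level: descriptor} for the query.
    db: hypothetical mapping {clip_id: {level: [frame descriptors]}}.
    Setting query_level == db_level gives a mono-level query.
    """
    query_desc = query_by_level[query_level]
    return [
        clip_id
        for clip_id, levels in db.items()
        if any(similarity(query_desc, d, alpha) >= threshold
               for d in levels.get(db_level, []))
    ]
```

In the “light client” case the query would be extracted at a coarse level and matched against fine-level DB descriptors; in the archive case the roles are swapped, and the same function covers both by exchanging the two level arguments.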

In [26], mainstream retrieval, which consists of matching SIFT descriptors extracted on object masks, and the global descriptor, i.e. a pair of wavelet histograms, are compared. It turns out that, firstly, the HF histogram is necessary ($0 < \alpha < 1$ in
