A content-based image retrieval approach based on document queries - Emerging Trends in Image Processing, Computer Vision, and Pattern Recognition


• model based (MRFs [5], fractals).

One of the most widely embraced approaches is to use local binary patterns [6].
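As an illustration only, the basic 3x3 local binary pattern operator can be sketched as below. The function names `lbp_3x3` and `lbp_histogram`, the neighbour ordering, and the use of NumPy are assumptions made for this sketch; the text does not say which LBP variant the cited work uses.

```python
import numpy as np

def lbp_3x3(gray):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left corner (the ordering is a convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.uint8) << np.uint8(bit)
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes, usable as a texture descriptor."""
    codes = lbp_3x3(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

The histogram, rather than the code image itself, is what would typically be passed on as the texture feature vector.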

Among local descriptors, probably the best-known algorithm, the scale-invariant feature transform (SIFT), was introduced by Lowe [7]. Since then, many approaches have been developed. Some of the most popular ones are based on speeded-up robust features (SURF) [8], histogram of oriented gradients (HOG) [9], gradient location and orientation histogram (GLOH) [10], or local energy-based shape histogram (LESH) [11].

3 Our approach

The proposed approach aims to classify a mixed set of images containing real-world scenes and document scans. The system largely follows the standard CBIR architecture, as can be seen in Figure 1. It is composed of two interconnected submodules:

• the training and learning module;

• the document classification module.

FIGURE 1 The basic system architecture.

A typical use case consists of the following stages:

• the system is trained on a set of images;

• each image is analyzed and decomposed into relevant descriptors;

• the descriptors are provided as input to a machine learning module, which is in charge of setting the class boundaries;

• each new regular image (not a document) is decomposed and classified accordingly;

• each new document scan is preprocessed and segmented, and the extracted images are then classified;

• the system extracts the 10 most relevant results and returns them as the answer to the user query.
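The stages above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the text does not name the learning algorithm, so a nearest-centroid classifier stands in for it. The class `NearestCentroidIndex`, the helper `query_document`, and the choice to rank results by Euclidean distance within the predicted class are all hypothetical. Descriptor extraction and scan segmentation are assumed to have happened already, so every image arrives as a numeric vector.

```python
import numpy as np

class NearestCentroidIndex:
    """Stand-in for the learning module: one centroid per class, Euclidean distance."""

    def fit(self, descriptors, labels):
        # Training stage: store the indexed descriptors and compute class centroids.
        self.descriptors = np.asarray(descriptors, dtype=np.float64)
        self.labels = np.asarray(labels)
        self.classes = np.unique(self.labels)
        self.centroids = np.stack(
            [self.descriptors[self.labels == c].mean(axis=0) for c in self.classes]
        )
        return self

    def classify(self, descriptor):
        # Assign the class whose centroid is closest to the query descriptor.
        descriptor = np.asarray(descriptor, dtype=np.float64)
        dists = np.linalg.norm(self.centroids - descriptor, axis=1)
        return self.classes[np.argmin(dists)]

    def query(self, descriptor, k=10):
        """Return the predicted class and the indices of the k closest indexed images in it."""
        descriptor = np.asarray(descriptor, dtype=np.float64)
        predicted = self.classify(descriptor)
        candidates = np.flatnonzero(self.labels == predicted)
        dists = np.linalg.norm(self.descriptors[candidates] - descriptor, axis=1)
        return predicted, candidates[np.argsort(dists)[:k]]

def query_document(index, region_descriptors, k=10):
    # A scan is assumed to be preprocessed and segmented into image regions already,
    # each reduced to a descriptor; every region is queried independently.
    return [index.query(d, k) for d in region_descriptors]
```

A regular image would go through `query` directly, while a document scan would go through `query_document` with one descriptor per extracted image.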

The indexing process is based on supervised machine learning and is conducted on regular images. The user may enter queries of either image type (regular images or document scans).

We use a mixed set of image characteristics:

• different color spaces;

• texture space;

• local descriptors.

We have not used any shape descriptors, as preliminary tests showed that they do not produce a noticeable improvement in this setting. The main problem is that the objects contained in the images may be affected by occlusion or clutter.

For the color descriptors, we have used four sets of characteristics, as follows:

• c1c2c3 and l1l2l3. As explained above, these color spaces are very useful when applied to real-world images. The coordinates are described by the equations below:
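The equations did not survive in this copy of the text. As a hedged sketch, the code below uses the definitions of these two color spaces as they are commonly given in the literature (usually attributed to Gevers and Smeulders): c1 = arctan(R / max(G, B)), and l1 = (R - G)^2 / ((R - G)^2 + (R - B)^2 + (G - B)^2), with the other coordinates obtained analogously. It is assumed, not confirmed, that the authors use these same forms. The function names and the handling of zero denominators are choices made for this illustration.

```python
import numpy as np

def c1c2c3(rgb):
    """c1c2c3 coordinates for an array of RGB values with shape (..., 3)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # arctan2(y, x) equals arctan(y / x) for non-negative inputs and stays finite when x == 0.
    c1 = np.arctan2(r, np.maximum(g, b))
    c2 = np.arctan2(g, np.maximum(r, b))
    c3 = np.arctan2(b, np.maximum(r, g))
    return np.stack([c1, c2, c3], axis=-1)

def l1l2l3(rgb):
    """l1l2l3 coordinates for an array of RGB values with shape (..., 3)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg, rb, gb = (r - g) ** 2, (r - b) ** 2, (g - b) ** 2
    total = rg + rb + gb
    # Gray pixels (R == G == B) make the denominator zero; they are mapped to 0 here by choice.
    safe = np.where(total > 0, total, 1.0)
    return np.stack([rg, rb, gb], axis=-1) / safe[..., None]
```

Scaling all three channels by the same factor leaves both outputs unchanged, which is the invariance to illumination intensity that makes these spaces attractive for real-world images.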
