• model based (MRFs [5], fractals).
One of the most widely adopted approaches is to use local binary patterns [6].
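As an illustration of the idea, the basic local binary pattern operator can be sketched in a few lines of Python. This is a minimal sketch of the textbook 3x3 variant, not necessarily the variant used in [6]; the function names, the clockwise neighbour order and the bit assignment are assumptions made for the example.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 grey-level patch.

    A neighbour contributes a 1 bit when its value is >= the centre value.
    The clockwise order starting at the top-left corner is an assumed convention.
    """
    center = patch[1][1]
    # (row, col) positions of the neighbours, clockwise from the top-left corner
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over the interior pixels of a 2D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The histogram of codes, rather than the individual codes, is what would typically serve as the texture descriptor of an image or region.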
As regards local descriptors, probably the best-known algorithm, the scale-invariant
feature transform (SIFT), was introduced by Lowe [7]. Since then, many approaches have been
developed. Some of the most popular are speeded up robust features (SURF) [8],
histogram of oriented gradients (HOG) [9], gradient location and orientation histogram
(GLOH) [10], and local energy-based shape histogram (LESH) [11].
3 Our approach
The proposed approach aims to classify a mixed set of images containing real-world scenes
and document scans. The system largely follows the standard CBIR architecture, as can be
seen in Figure 1. It is composed of two interconnected submodules:
• the training and learning module;
• the document classification module.
FIGURE 1 The basic system architecture.
A typical use case consists of the following stages:
• the system is trained on a set of images;
• each image is analyzed and decomposed into relevant descriptors;
• the descriptors are provided as input to a machine learning module, which is in charge of
setting the class boundaries;
• each new regular image (not a document) is decomposed and classified accordingly;
• each new document scan is preprocessed and segmented. The extracted images are then
classified;
• the system extracts the 10 most relevant results and provides them as an answer to the user
query.
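The stages above can be sketched as a small Python pipeline. This is only an illustrative sketch under several assumptions: the descriptors are taken as already computed numeric vectors, a nearest-centroid rule stands in for the machine learning module (the text does not name the classifier), relevance is measured by Euclidean distance, and the preprocessing and segmentation of document scans are omitted. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_centroids(labelled):
    """Stand-in for the learning module: one mean descriptor per class.

    `labelled` is a list of (descriptor, class_label) pairs.
    """
    groups = defaultdict(list)
    for descriptor, label in labelled:
        groups[label].append(descriptor)
    return {
        label: [sum(col) / len(col) for col in zip(*vectors)]
        for label, vectors in groups.items()
    }


def classify(descriptor, centroids):
    """Assign the class whose centroid is nearest to the descriptor."""
    return min(centroids, key=lambda label: euclidean(descriptor, centroids[label]))


def query(descriptor, index, centroids, k=10):
    """Return up to k indexed image ids of the predicted class, nearest first.

    `index` is a list of (image_id, descriptor, class_label) triples.
    """
    label = classify(descriptor, centroids)
    candidates = [
        (euclidean(descriptor, d), image_id)
        for image_id, d, lbl in index
        if lbl == label
    ]
    return [image_id for _, image_id in sorted(candidates)[:k]]


# Toy index with hypothetical 2D descriptors and class labels
index = [
    ("scene_1", [0.9, 0.1], "outdoor"),
    ("scene_2", [0.8, 0.2], "outdoor"),
    ("scene_3", [0.1, 0.9], "indoor"),
    ("scene_4", [0.2, 0.7], "indoor"),
]
centroids = train_centroids([(d, lbl) for _, d, lbl in index])
print(query([0.88, 0.12], index, centroids))
```

Restricting the ranking to the predicted class is one possible design choice; ranking the whole index by distance would be an equally plausible reading of the last stage.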
The indexing process is based on supervised machine learning and is conducted on regular
images. The user is allowed to enter queries based on both image types.
We are using a mixed set of image characteristics:
• different color spaces;
• texture space;
• local descriptors.
We have not used any shape descriptors, as preliminary tests showed that they do not
produce a noticeable improvement in this area. The main problem is that the objects
contained in the images may be affected by occlusion or clutter.
For the color descriptors, we have used four sets of characteristics, as follows:
c1c2c3 and l1l2l3. As explained above, these color spaces are very useful when applied to
real-world images. The coordinates are described by the equations below:
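A minimal Python sketch of both conversions follows, assuming the definitions commonly given in the literature for these spaces: c1 = arctan(R / max(G, B)), c2 = arctan(G / max(R, B)), c3 = arctan(B / max(R, G)), and l1 = (R - G)^2 / D, l2 = (R - B)^2 / D, l3 = (G - B)^2 / D, with D = (R - G)^2 + (R - B)^2 + (G - B)^2. The function names and the handling of zero denominators are illustrative choices, not part of the described system.

```python
import math


def rgb_to_c1c2c3(r, g, b):
    """Convert non-negative R, G, B values to (c1, c2, c3), in radians.

    atan2(x, y) equals arctan(x / y) for y > 0 and avoids a division by
    zero when the larger of the other two channels is 0.
    """
    c1 = math.atan2(r, max(g, b))
    c2 = math.atan2(g, max(r, b))
    c3 = math.atan2(b, max(r, g))
    return (c1, c2, c3)


def rgb_to_l1l2l3(r, g, b):
    """Convert R, G, B values to (l1, l2, l3).

    For grey pixels (R == G == B) the denominator is 0; returning zeros
    in that case is an assumption made for this sketch.
    """
    rg = (r - g) ** 2
    rb = (r - b) ** 2
    gb = (g - b) ** 2
    total = rg + rb + gb
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (rg / total, rb / total, gb / total)


print(rgb_to_c1c2c3(255, 0, 0))
print(rgb_to_l1l2l3(255, 0, 0))
```

Under these definitions, l1, l2 and l3 sum to 1 for any non-grey pixel, and each c coordinate lies in the range 0 to pi/2.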
 