mechanisms to store and retrieve them. Therefore, content-based image retrieval
(CBIR) techniques have been intensively investigated in recent years [16].
CBIR techniques rely on image processing algorithms to extract relevant
characteristics (features) from the images. The characteristics are grouped into
feature vectors, which are stored and organized by indexing structures aiming at
achieving fast and efficient image retrieval. Generally, CBIR techniques use
intrinsic visual features of images, such as color, shape and texture [13], yielding
vectors with hundreds or even thousands of features. Contrary to what one might
expect, having a large number of features is actually a problem. As the number
of the extracted features grows, the process of storing, indexing, retrieving, and
comparing them becomes more and more time-consuming. Moreover, in several
situations, many features are correlated, meaning that they carry redundant
information about the images, which can degrade the ability of the system to
correctly distinguish them. The large number of features leads CBIR systems to
face the problem known as the “dimensionality curse” [17]. Beyer [7] has shown that,
as the number of features increases, the significance of each feature tends to
diminish. Hence, it is important to keep the number of features as low as
possible, establishing a tradeoff between the representation power and the feature
vector size.
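
The effect described above can be observed numerically. The following is a minimal sketch, not taken from the chapter: it assumes synthetic feature vectors drawn uniformly at random and measures the relative contrast (d_max - d_min) / d_min between the farthest and nearest neighbors of a random query. The function names and parameter values are illustrative assumptions. As the number of dimensions grows, the contrast shrinks, so distances become less useful for telling images apart.

```python
import numpy as np


def relative_contrast(n_points, n_dims, rng):
    """Return (d_max - d_min) / d_min for one random query.

    Assumption: feature vectors are synthetic, uniform in [0, 1)^n_dims,
    and compared with the Euclidean distance.
    """
    data = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    dists = np.linalg.norm(data - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


def contrast_by_dimension(dims=(2, 10, 100, 1000), n_points=1000,
                          trials=20, seed=0):
    """Average the relative contrast over several trials per dimensionality."""
    rng = np.random.default_rng(seed)
    return {
        d: float(np.mean([relative_contrast(n_points, d, rng)
                          for _ in range(trials)]))
        for d in dims
    }


if __name__ == "__main__":
    for d, c in contrast_by_dimension().items():
        print(f"{d:5d} features: relative contrast = {c:.3f}")
```

Real image features are not uniformly distributed, so the exact values will differ in practice; the sketch only illustrates the trend.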
Image features are also commonly employed in the classification task. A
significant example is the classification of tumor masses detected in mammograms as
benign or malignant. Initially, the radiologist classifies the images based on the
shape of the lesion. Malignant tumors generally infiltrate the surrounding tissue,
resulting in an irregular or barely distinguishable contour, while benign masses
have a smooth contour. Figure 7.1 illustrates two examples of tumor masses.
This chapter discusses how to apply techniques of mining statistical
association rules to improve content-based image retrieval in the medical domain. We
present a new algorithm (StARMiner, the Statistical Association Rule Miner)
to determine a minimal set of representative features. The algorithm uses
statistical measurements, which describe the behavior of the features considering
the image categories, to find representative rules. We compare the efficacy of
StARMiner with that of other well-known feature selection algorithms, Relief-F and
DTM (Decision Tree Method), in the task of feature selection using a case
study.
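
To make the idea of statistical measurements per image category concrete, the sketch below computes the mean and standard deviation of each feature within each category. It is not the StARMiner algorithm, whose rules are not specified in this passage. The separation score, the function names, and the toy data are assumptions introduced only for illustration.

```python
import numpy as np


def per_class_stats(features, labels):
    """Map each category to (mean, std) arrays with one entry per feature."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    return {
        label.item(): (features[labels == label].mean(axis=0),
                       features[labels == label].std(axis=0))
        for label in np.unique(labels)
    }


def separation_scores(features, labels, class_a, class_b, eps=1e-12):
    """Illustrative score (an assumption, not StARMiner's criterion):
    gap between the class means divided by the summed class spreads.
    A higher score means the feature behaves more differently across
    the two categories."""
    stats = per_class_stats(features, labels)
    mean_a, std_a = stats[class_a]
    mean_b, std_b = stats[class_b]
    return np.abs(mean_a - mean_b) / (std_a + std_b + eps)


if __name__ == "__main__":
    # Toy data: two hypothetical features for four images.
    X = [[0.1, 5.0], [0.2, 5.1], [0.9, 5.0], [0.8, 4.9]]
    y = ["benign", "benign", "malignant", "malignant"]
    print(separation_scores(X, y, "benign", "malignant"))
```

In this toy data the first feature separates the two categories far better than the second, so a selection procedure built on such statistics would tend to keep the first and discard the second.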
Fig. 7.1. Typical breast tumor masses: benign (left) and malignant (right)
 