Mining Statistical Association Rules to Select the Most Relevant Medical Image Features - Mining Complex Data

7.2 Background

Image mining has been the focus of much research in data mining and information retrieval in recent years. A major challenge in the field is to effectively relate low-level features (automatically extracted from image pixels) to high-level semantics based on human perception.

According to [11], research in image mining can be broadly classified into two main directions: domain-specific and general-purpose. The domain-specific direction focuses on image processing techniques, where the goal is to process the image and extract the features that best differentiate images of different types. The general-purpose direction focuses on mining algorithms that aim to reduce the semantic gap between high-level human perception of images and low-level image feature representation. The two directions are complementary: general-purpose techniques improve the accuracy of domain-specific techniques.

Mining images demands the extraction of their main features according to specific criteria. Once extracted, the feature vectors and the image descriptions are submitted to the mining process.

When working with image databases, high-level data manually supplied by domain experts can also be employed together with low-level features in image mining processes. However, with the growth of large-scale image repositories, manual annotation of images has become infeasible because of its inherent problems of subjectivity, non-scalability, and non-uniformity of vocabulary. Content-based image retrieval (CBIR) systems have been proposed to overcome these limitations: the images most similar to a given one are retrieved based on comparisons of visual features (automatically extracted from the images). The retrieved images can be employed to label a new one or, in the case of medical images, to support the decision-making process of diagnosing a new image.
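
The retrieval step described above can be illustrated with a minimal sketch. It assumes that each image is already represented by a numeric feature vector and that similarity is measured with the Euclidean distance; the chapter does not prescribe a distance function at this point, so that choice, the function names, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_similar(query, database, k=3):
    """Return the k database entries closest to the query feature vector.

    `database` is a list of (image_id, feature_vector, label) tuples.
    """
    ranked = sorted(database, key=lambda entry: euclidean(query, entry[1]))
    return ranked[:k]

def suggest_label(query, database, k=3):
    """Suggest a label by majority vote among the k nearest images."""
    neighbours = retrieve_similar(query, database, k)
    votes = Counter(label for _, _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy data: hypothetical 3-dimensional vectors, not real image features.
database = [
    ("img01", [0.10, 0.20, 0.10], "normal"),
    ("img02", [0.15, 0.25, 0.05], "normal"),
    ("img03", [0.90, 0.80, 0.70], "abnormal"),
    ("img04", [0.85, 0.75, 0.80], "abnormal"),
    ("img05", [0.20, 0.10, 0.15], "normal"),
]
query = [0.12, 0.22, 0.08]

top = retrieve_similar(query, database, k=3)
print([image_id for image_id, _, _ in top])
print(suggest_label(query, database, k=3))
```

The majority vote is only one simple way to reuse the retrieved images for labeling; in a medical setting the retrieved cases would more likely be shown to the specialist as supporting evidence.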

In this section, two data mining tasks are discussed: feature selection and association rule mining. We concentrate on applying these techniques to extract patterns from images and to improve content-based search in medical image databases. We also describe how to evaluate the results of a CBIR system using precision versus recall (P&R) curves, which we employed to evaluate our experiments.
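
As a rough illustration of how the points of a P&R curve can be obtained, the sketch below uses the standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that have been retrieved. One point is computed per rank cut-off of a ranked answer. The function name and the toy ranking are assumptions made for this example, not part of the original experiments.

```python
def precision_recall_curve(ranked_ids, relevant_ids):
    """Return (recall, precision) pairs, one per rank cut-off."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("at least one relevant image is required")
    points = []
    hits = 0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
        precision = hits / rank          # relevant retrieved / retrieved
        recall = hits / len(relevant)    # relevant retrieved / all relevant
        points.append((recall, precision))
    return points

# Toy ranking: "a", "b" and "c" are the relevant images for the query.
ranked = ["a", "x", "b", "y", "c"]
relevant = {"a", "b", "c"}

for recall, precision in precision_recall_curve(ranked, relevant):
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

When curves for several methods are plotted together, the curve that stays closer to the top of the graph corresponds to the more effective retrieval.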

7.2.1 Feature Selection

Dimensionality reduction (also called dimension reduction) is the process of reducing the number of features (attributes) used to represent a dataset. Dimensionality reduction approaches shrink the feature vector by removing redundant, correlated, and noisy data. In most cases, dimensionality reduction speeds up the processing of data mining algorithms and improves their accuracy.
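
To make the idea of removing redundant and correlated data concrete, the following sketch applies a simple correlation filter: constant features are dropped, and a feature is dropped when its absolute Pearson correlation with an already kept feature reaches a threshold. This is a generic illustration, not the statistical association rule method this chapter proposes; the threshold of 0.95, the function names, and the toy data are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(rows, threshold=0.95):
    """Return the indices of the features kept by the filter."""
    columns = list(zip(*rows))  # one tuple per feature
    kept = []
    for j, column in enumerate(columns):
        if len(set(column)) == 1:
            continue  # constant feature: carries no information
        if any(abs(pearson(column, columns[i])) >= threshold for i in kept):
            continue  # redundant with a feature that was already kept
        kept.append(j)
    return kept

# Toy dataset: feature 1 is twice feature 0, feature 2 is constant.
rows = [
    [1.0, 2.0, 5.0, 0.3],
    [2.0, 4.0, 5.0, 0.9],
    [3.0, 6.0, 5.0, 0.1],
    [4.0, 8.0, 5.0, 0.7],
]
kept = drop_correlated(rows)
reduced = [[row[j] for j in kept] for row in rows]
print(kept)
print(reduced)
```

In this toy case the four-dimensional vectors shrink to two dimensions, which is the kind of reduction that lowers the cost of later distance computations.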

Dimensionality reduction approaches are classified into feature selection and feature transformation approaches. There is no agreement about the
