multiple databases was dealt with. In [25] an algorithm for mining association
rules in data warehouses was presented.
Mining association rules in image datasets has been a great challenge.
Association rule mining procedures do not produce interesting results by
themselves: images must first be pre-processed by image processing algorithms
to produce the image data that is submitted to the mining process.
7.2.3 Content-Based Retrieval Evaluation
When working with content-based retrieval, performing exact searches on image
datasets is not useful, since searching for the same data already under analysis
has very few applications. Therefore, the retrieval of complex data is mainly
performed by similarity. The most well-known and useful types of similarity
queries are the k-nearest neighbor query (for instance: “given the Thorax-XRay
of John Doe, find the five images most similar to it in the image database”)
and the range query (for instance: “given the Thorax-XRay of John Doe, find the
images that differ from it by up to three units”). A similarity search compares
feature vectors using a distance function that quantifies how close (or similar)
each pair of vectors is.
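The two query types can be illustrated with a minimal sketch. It assumes the Euclidean distance as the distance function (the text does not fix a particular one) and a small in-memory dictionary of feature vectors; the names `knn_query`, `range_query`, and the image identifiers are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(query, dataset, k, distance=euclidean):
    """Return the ids of the k vectors closest to the query vector."""
    ranked = sorted(dataset.items(), key=lambda item: distance(query, item[1]))
    return [image_id for image_id, _ in ranked[:k]]


def range_query(query, dataset, radius, distance=euclidean):
    """Return the ids of all vectors within `radius` of the query vector."""
    return [
        image_id
        for image_id, vector in dataset.items()
        if distance(query, vector) <= radius
    ]


# Toy feature vectors (hypothetical data, not taken from the chapter).
features = {
    "img_a": [0.0, 0.0],
    "img_b": [1.0, 0.0],
    "img_c": [0.0, 2.0],
    "img_d": [3.0, 4.0],
}
query = [0.0, 0.0]

nearest = knn_query(query, features, k=2)
within = range_query(query, features, radius=3.0)
```

A linear scan such as this one is only meant to show the semantics of the queries; an actual retrieval system would normally rely on an index structure rather than sorting the whole dataset.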
This chapter focuses on medical images, more specifically on the feature
vectors employed to compare and retrieve images by similarity. The motivation
is to reduce the usually large number of extracted features: PACS and CAD
systems typically gather as many image characteristics as possible, leading to
high-dimensional feature vectors that contain much redundant information.
Consequently, it is necessary to select the features that keep the most
meaningful information. Notice that the proposed approach can be
straightforwardly extended to other types of complex data beyond images, since
similarity queries are generally the most suitable for complex data.
In this chapter, we present a technique that uses association rules to improve
content-based image retrieval in the medical domain. One important issue
related to CBIR systems is how to evaluate their efficacy. A standard approach
to evaluate the accuracy of similarity queries is the precision and recall
(P&R) graph [5]. Precision and recall are defined in Equation 7.3 and
Equation 7.4.
\[
\text{Precision} = \frac{TRS}{TS} \tag{7.3}
\]

\[
\text{Recall} = \frac{TRS}{TR} \tag{7.4}
\]
In Equations 7.3 and 7.4, TR is the total number of relevant images for a given
query, TRS is the number of relevant images actually returned by the query, and
TS is the total number of images returned by the query. In our experiments we
use precision and recall (P&R) curves to analyze our proposed algorithm,
StARMiner.
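As an illustration of Equations 7.3 and 7.4, the sketch below computes precision and recall for one query from the set of returned images and the set of relevant images. The function names and the toy data are hypothetical. The `pr_curve` helper assumes one common way of obtaining the points of a P&R curve, namely evaluating the ranked result list at every cut-off; the text does not specify how the curves in the experiments were built.

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) following Equations 7.3 and 7.4."""
    returned_set = set(returned)
    relevant_set = set(relevant)
    trs = len(returned_set & relevant_set)  # relevant images actually returned
    ts = len(returned_set)                  # all images returned
    tr = len(relevant_set)                  # all relevant images for the query
    precision = trs / ts if ts else 0.0
    recall = trs / tr if tr else 0.0
    return precision, recall


def pr_curve(ranked, relevant):
    """Precision/recall pairs at each cut-off of a ranked result list."""
    return [
        precision_recall(ranked[:cutoff], relevant)
        for cutoff in range(1, len(ranked) + 1)
    ]


# Toy query result (hypothetical data): 4 images returned, 3 relevant overall.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

precision, recall = precision_recall(ranked, relevant)
curve = pr_curve(ranked, relevant)
```

In this toy case TRS = 2, TS = 4, and TR = 3, so precision is 2/4 and recall is 2/3. Curves closer to the top of the P&R graph indicate better retrieval.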