Biomedical Engineering Reference
In-Depth Information
FIGURE 5.7
Continuous calibration surface for the bR photocell generated by a 2-25-1 RBF network. The
inputs are wavelength and intensity, and the output is photocell response.
the trained classifier. One validation technique is based on the principle of “leave-one-out.”
In this case, a small number of data vectors are randomly removed from the original
dataset and used to test the classifier model. This is essentially the approach used for
testing a trained neural network. In some implementations, the process is repeated until
every data point has been left out once for testing the classifier, and the estimated accuracy
for each test is then averaged for the final performance assessment.
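The repeated hold-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the authors: the nearest-centroid classifier, the synthetic two-class data, and the names `leave_k_out_accuracy`, `nearest_centroid_fit`, and `nearest_centroid_predict` are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Return the class labels and the mean vector of each class."""
    labels = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def nearest_centroid_predict(model, X):
    """Assign each vector to the class with the closest centroid."""
    labels, centroids = model
    # Distances have shape (n_samples, n_classes)
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]

def leave_k_out_accuracy(X, y, k=1, seed=0):
    """Hold out k vectors at a time until every vector has been
    tested exactly once, then average the per-test accuracies."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    scores = []
    for start in range(0, len(X), k):
        test_idx = order[start:start + k]
        train_mask = np.ones(len(X), dtype=bool)
        train_mask[test_idx] = False
        model = nearest_centroid_fit(X[train_mask], y[train_mask])
        pred = nearest_centroid_predict(model, X[test_idx])
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))

# Hypothetical data: two well-separated classes in a 2-D feature space
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
               rng.normal(5.0, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

acc = leave_k_out_accuracy(X, y, k=1)
print(f"leave-one-out accuracy: {acc:.2f}")
```

Setting `k=1` gives strict leave-one-out; a larger `k` removes a small group of vectors per test, which is cheaper because the classifier is retrained fewer times.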
Alternatively, unsupervised pattern classification has become a popular approach for
exploratory data analysis where class assignment is largely unknown or when it is necessary
to reduce the volume or dimensionality of the sensor data to a manageable size for
meaningful analysis. These techniques are referred to as data grouping, clumping, and
cluster analysis (8). The primary goal is to identify groups of similar data vectors that exist
in a common feature space by projecting them into a lower dimensional space that reflects
the novel attributes in the original data.
Most unsupervised classification techniques search for clusters that are disjoint or
mutually exclusive, as opposed to clusters that overlap due to clumping. Once the clusters
are identified, it is possible to reduce the number of elements necessary for visual display
by restricting the view to only the clusters and not the original data. The mapping of large
numbers of data vectors into a limited number of clusters provides the analyst with an
overview of the data structure, and permits the observer to retain proper context while
reducing visual complexity. Most clustering algorithms seek to find a balance between the
number of clusters and the number of data vectors assigned to each cluster. A small number
of clusters will permit fast information processing, but the information content that is
visible to the analyst will become very low. Critical details can be lost in this
transformation, so applications that depend on them must use “lossless” techniques.
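The trade-off between the number of clusters and the detail that survives the mapping can be illustrated with a short sketch. It assumes a plain K-means procedure, in which each cluster center is the mean of the vectors assigned to it, and it measures lost detail as the summed squared distance from each vector to its center. The synthetic data and the names `kmeans` and `within_cluster_error` are hypothetical and serve only this example.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Basic K-means: alternate between assigning vectors to the
    nearest center and recomputing each center as a cluster mean."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data vectors chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = np.argmin(d, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment, consistent with the returned centers
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, np.argmin(d, axis=1)

def within_cluster_error(X, centers, assign):
    """Summed squared distance from each vector to its cluster center."""
    return float(np.sum((X - centers[assign]) ** 2))

# Hypothetical data: three groups of vectors in a 2-D feature space
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(m, 0.4, (30, 2)) for m in (0.0, 4.0, 8.0)])

errors = {}
for k in (1, 2, 3, 6):
    centers, assign = kmeans(X, k)
    errors[k] = within_cluster_error(X, centers, assign)
    print(f"k={k}: within-cluster error = {errors[k]:.1f}")
```

With very few clusters the display is simple but the error is large, meaning much of the structure in the original vectors is hidden from the analyst; adding clusters recovers detail at the cost of more elements to inspect.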
Two exploratory clustering methods commonly used to identify groups of data vectors
that form patterns in the feature space are the K-means clustering algorithm and Kohonen's
self-organizing feature map (SOFM). K-means is a clustering algorithm that finds groups of
similar vectors, each represented by its cluster center, which is computed as the mean of
the data vectors assigned to it (8). The SOFM learns patterns by forming cluster centers through an