Biomedical Engineering Reference
In-Depth Information
K-means Algorithm
K-means clustering [ 22 , 23 ] is a classical clustering algorithm. After an initial
random assignment of examples to K clusters, the centers of the clusters are computed
and each example is reassigned to the cluster with the closest center. The process
is repeated until the cluster centers no longer change significantly. Once the cluster
assignment is fixed, the mean distance of an example to the cluster centers is used as
its score. Using the K-means clustering algorithm, different clusters were specified
and generated for each output class.
Input: The number of clusters K and a dataset for intrusion detection.
Output: A set of K clusters that minimizes the squared-error criterion.
Algorithm:
1. Initialize K clusters (randomly select K elements from the data as the initial
centroids).
2. While the cluster structure changes, repeat Steps 3 to 5.
3. Determine the cluster to which each data element belongs, using the Euclidean
distance formula. Add the element to the cluster with min [Distance (xi, yj)].
4. Calculate the means of the clusters.
5. Change the cluster centroids to the means obtained in Step 4.
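The steps above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the cited work: the function names (`kmeans`, `squared_error`, `score`), the `max_iter` cap, the fixed random seed and the handling of empty clusters are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(data, k, max_iter=100, seed=0):
    """Return (centroids, labels) for the points in `data`."""
    rng = random.Random(seed)
    # Step 1: pick k elements of the data as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(data, k)]
    labels = None
    # Step 2: repeat while the cluster structure changes
    # (max_iter is an assumed safety cap, not part of the original steps).
    for _ in range(max_iter):
        # Step 3: assign every element to the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(x, centroids[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Steps 4 and 5: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = mean_point(members)
    return centroids, labels


def squared_error(data, centroids, labels):
    """Sum of squared distances from each point to its own centroid."""
    return sum(euclidean(x, centroids[lab]) ** 2 for x, lab in zip(data, labels))


def score(x, centroids):
    """Mean distance from one example to all cluster centers."""
    return sum(euclidean(x, c) for c in centroids) / len(centroids)


if __name__ == "__main__":
    points = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
    centers, assignment = kmeans(points, k=2)
    print(sorted(centers), assignment, squared_error(points, centers, assignment))
```

Points are plain tuples so the sketch needs only the standard library; for real intrusion-detection data a vectorized implementation would be the more practical choice.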
The main disadvantage of the K-means algorithm is that it may take a large number
of iterations through dense datasets before it converges to a stable set of
centroids, and that set is in general only a local optimum of the squared-error
criterion. This can be inefficient on large datasets, because the number of
iterations the centroids need to converge is not known in advance.
Negative Selection Algorithm for Intrusion Detection
Negative selection algorithms [ 23 , 24 ] are inspired by the main mechanism in
the thymus that produces a set of mature T-cells capable of binding only non-self
antigens. The first negative selection algorithm was proposed by Forrest et al. [ 25 ]
to detect data manipulation caused by a virus in a computer system. The starting
point of this algorithm is to produce a set of self-strings, S, that define the normal
state of the system. The task then is to generate a set of detectors, D, that
bind/recognize only the complement of S. These detectors can then be applied to new
data in order to classify them as self or non-self; in the original work by
Forrest et al., a non-self classification indicates that data has been manipulated.
The algorithm of Forrest et al. produces the set of detectors via the process
outlined below.
Input: S_seen = set of seen, known self-elements
Output: D = set of generated detectors
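The general idea described above (generate detectors that recognize nothing in S, then use them to flag non-self data) can be sketched in Python as follows. This is a hedged illustration, not the exact procedure of Forrest et al.: the random generation of candidates, the r-contiguous-symbols matching rule, the binary alphabet, the `max_tries` cap and every function name are assumptions made for the example.

```python
import random


def matches(detector, string, r):
    """True if detector and string agree on at least r contiguous positions."""
    run = 0
    for a, b in zip(detector, string):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, n_detectors, length, r,
                       alphabet="01", max_tries=100000, seed=0):
    """Build a detector set D in which no detector matches any self-string."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        # Assumed generation scheme: draw a random candidate string.
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # Negative selection: discard a candidate that matches any self-string.
        if not any(matches(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors


def classify(sample, detectors, r):
    """Label a sample 'nonself' if any detector matches it, else 'self'."""
    return "nonself" if any(matches(d, sample, r) for d in detectors) else "self"


if __name__ == "__main__":
    self_strings = ["00000000", "00001111"]
    D = generate_detectors(self_strings, n_detectors=5, length=8, r=4)
    for s in self_strings:
        print(s, classify(s, D, r=4))
```

By construction every self-string is classified as self; how much of the non-self space is covered depends on the number of detectors and on the assumed matching threshold `r`.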