4.1 Overview of Clustering
In general, clustering is the use of unsupervised techniques for grouping similar
objects. In machine learning, unsupervised refers to the problem of finding hidden
structure within unlabeled data. Clustering techniques are unsupervised in the
sense that the data scientist does not determine, in advance, the labels to apply
to the clusters. The structure of the data describes the objects of interest and
determines how best to group the objects. For example, based on customers'
personal income, it is straightforward to divide the customers into three groups
using arbitrarily selected cutoff values, as follows:
• Earn less than $10,000
• Earn between $10,000 and $99,999
• Earn $100,000 or more
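The three-way split above amounts to a fixed rule with hand-picked cutoffs. A minimal sketch follows; the function name `income_group` and the sample incomes are hypothetical, chosen only for illustration:

```python
def income_group(income):
    """Assign a customer to one of three groups using fixed, hand-picked cutoffs."""
    if income < 10_000:
        return "less than $10,000"
    if income < 100_000:
        return "$10,000 to $99,999"
    return "$100,000 or more"


# Hypothetical incomes, for illustration only.
incomes = [8_500, 42_000, 90_000, 110_000]
groups = [income_group(x) for x in incomes]
```

Note that the rule never looks at the data itself: the cutoffs are set before any customer is examined.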
In this case, the income levels were chosen somewhat subjectively based on
easy-to-communicate points of delineation. However, such groupings do not
indicate a natural affinity of the customers within each group. In other words, there
is no inherent reason to believe that a customer making $90,000 will behave
any differently from a customer making $110,000, even though the $100,000 cutoff
places them in different groups. As additional dimensions are
introduced by adding more variables about the customers, the task of finding
meaningful groupings becomes more complex. For instance, suppose variables such
as age, years of education, household size, and annual purchase expenditures were
considered along with the personal income variable. What are the naturally occurring
groupings of customers? This is the type of question that clustering analysis can
help answer.
Clustering is often used for exploratory analysis of data. Clustering makes no
predictions. Rather, clustering methods find the similarities
between objects according to the object attributes and group the similar objects into
clusters. Clustering techniques are utilized in marketing, economics, and various
branches of science. A popular clustering method is k-means.
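To make the idea of grouping by similarity concrete, here is a minimal sketch of k-means in its common iterative form (often called Lloyd's algorithm), written with only the Python standard library. It is an illustration under stated assumptions, not a production implementation: the function name `kmeans`, its parameters, and the customer data (age in years, income in thousands of dollars) are all hypothetical, and Euclidean distance is assumed as the similarity measure.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means sketch for small in-memory data (tuples of numbers)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


# Hypothetical customers: (age in years, income in thousands of dollars).
customers = [(25, 30), (27, 35), (30, 32), (52, 95), (55, 110), (58, 100)]
labels, centroids = kmeans(customers, k=2)
```

No group labels are supplied here. The only input besides the data is the number of clusters, and the groups emerge from the distances between customers. In practice, attributes measured on very different scales are usually standardized first so that no single attribute dominates the distance.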