Advantages of Information Granulation in Clustering Algorithms - Agents and Artificial Intelligence

Granular computing (GrC) is a new multidisciplinary theory that has developed rapidly in recent years. The most common definitions of GrC [10], [14] assume computing with information granules, that is, collections of objects that exhibit similarity in terms of their properties or functional appearance. Although the term is new, the ideas and concepts of GrC have been used in many fields under different names: information hiding in programming, granularity in artificial intelligence, divide and conquer in theoretical computer science, interval computing, cluster analysis, fuzzy and rough set theories, neutrosophic computing, quotient space theory, belief functions, machine learning, databases, and many others. According to a more universal definition, granular computing may be considered a label for a new field of multidisciplinary study dealing with theories, methodologies, techniques and tools that make use of granules in the process of problem solving [2].

A distinguishing aspect of GrC is its multi-perspective standpoint on data. Multi-perspective means diverse levels of resolution, depending on the salient features or the level of detail of the studied problem. Data granules identified at different levels of resolution form a hierarchical structure expressing relations between data objects. This structure can be used to facilitate investigation and helps in understanding complex systems. Understanding the analyzed problem and the attained results is the main aspect of human-oriented systems. Accordingly, there are also definitions of granular computing focused on systems supporting human beings [2]. According to these definitions, such a methodology makes it possible to ignore irrelevant details and concentrate on the essential features of the systems, making them more understandable.

There have been many attempts to solve problems with data granulation. To give a few examples: knowledge exploration in spatio-temporal databases [8], an intelligent fault detection system [7], image segmentation [13], and data mining [11]. In [1], an approach to data granulation based on approximating data by multi-dimensional hyperboxes is presented. The hyperboxes represent data granules formed from the data points with a focus on maximizing the density of information present in the data. Among other benefits, this approach improves computational performance. The algorithm is described in the following sections.
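To make the notion of a hyperbox granule concrete, the following minimal sketch represents a granule as an axis-aligned box given by its lower and upper corners, together with the number of points it covers. It does not reproduce the granulation algorithm of [1]: the names `Hyperbox` and `bounding_hyperbox` are hypothetical, and the density measure (points per unit volume) is only an illustrative assumption standing in for the "density of information" mentioned above.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hyperbox:
    """Axis-aligned box described by its lower and upper corners (hypothetical type)."""
    lower: List[float]
    upper: List[float]
    count: int  # number of data points covered by the box

    def volume(self, min_width: float = 1e-9) -> float:
        # Clamp zero-width sides so a degenerate box still has a non-zero volume.
        vol = 1.0
        for lo, hi in zip(self.lower, self.upper):
            vol *= max(hi - lo, min_width)
        return vol

    def density(self) -> float:
        # Assumed, illustrative measure: points per unit volume.
        return self.count / self.volume()


def bounding_hyperbox(points: Sequence[Sequence[float]]) -> Hyperbox:
    """Return the smallest axis-aligned hyperbox containing all given points."""
    if not points:
        raise ValueError("at least one point is required")
    dims = list(zip(*points))  # one tuple of coordinates per dimension
    return Hyperbox(
        lower=[min(d) for d in dims],
        upper=[max(d) for d in dims],
        count=len(points),
    )


# Three 2-D points collapse into a single granule.
box = bounding_hyperbox([(0.0, 0.0), (1.0, 2.0), (0.5, 1.0)])
```

A clustering method that works on such granules handles one box in place of the points it covers, which is the intuition behind the reduced processing time discussed in this article.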

This article examines an approach to data clustering based on processing granules of data in the form of hyperboxes. This solution is characterized by reduced processing time in contrast to processing point-type data. Experiments have been performed on several multi-dimensional data sets containing different numbers of clusters. Both the time of data clustering and the quality of results, measured by quality indices, have been examined. The article also discusses a way of creating a hierarchical structure of data containing levels of point-type object clusters as well as groups of hyperboxes.

This paper is organized as follows. Section 2 describes the method of hyperbox creation. Section 3 describes the clustering methods: the traditional partitioning (Section 3.1) and hierarchical (Section 3.2) methods, and one recently proposed method, SOSIG (Section 3.3). Section 4 describes indices for assessing clustering results. Section 5 reports on the collected data sets and the executed experiments. The last section concludes the article.
