4
Data Quality Enhancement Technology to
Improve Decision Support
Ahmad Shahi, Rodziah binti Atan and Nasir bin Sulaiman
University Putra Malaysia
Malaysia
1. Introduction
Uncertainty is an important aspect of real human life. It means "Not knowing with
certainty; such as cannot be definitely forecast" [1]. In fuzzy systems, uncertainty arises
mainly from the volume of work, a lack of theoretical knowledge and a lack of experimental
results [2].
Mendel [3] has noted that uncertainty also exists while building and using typical fuzzy
logic systems (FLS). He has described four sources of uncertainty: (1) Uncertainty about the
meanings of the words that are used in a rule. This is uncertainty in the membership
functions, because membership functions represent words in a fuzzy logic system, and it can
affect both antecedents and consequents. (2) Uncertainty about the consequent that is used in
a rule. This is uncertainty in the rule itself: a rule in an FLS describes the impact of the
antecedents on the consequent, and experts may differ in their opinion of the nature of this
impact. (3) Uncertainty about the measurements that activate the FLS. This is uncertainty in
the crisp input values or measurements, which may be noisy or corrupted. The noise may lie
within a certain range or be totally uncertain, that is, it may be stationary or
non-stationary. (4) Uncertainty about the data that are used to tune the parameters of an FLS.
This is, again, uncertainty in the measurements.
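The first and third of these sources can be made concrete with a small sketch. It assumes a triangular membership function and a hypothetical linguistic term "warm" on a temperature axis; the function name `triangular`, the term, and all numeric values are illustrative assumptions and do not come from Mendel [3] or from this chapter.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical definitions of the word "warm" (degrees Celsius) by two experts.
# Their disagreement illustrates uncertainty about the meaning of a word.
WARM_EXPERT_A = (15.0, 25.0, 35.0)
WARM_EXPERT_B = (18.0, 26.0, 34.0)

clean_reading = 25.0
noisy_reading = 28.0  # the same quantity, corrupted by an assumed 3-degree sensor error

# Same crisp input, different membership functions (word-meaning uncertainty).
mu_expert_a = triangular(clean_reading, *WARM_EXPERT_A)
mu_expert_b = triangular(clean_reading, *WARM_EXPERT_B)

# Same membership function, different crisp inputs (measurement uncertainty).
mu_clean = triangular(clean_reading, *WARM_EXPERT_A)
mu_noisy = triangular(noisy_reading, *WARM_EXPERT_A)

print(mu_expert_a, mu_expert_b)
print(mu_clean, mu_noisy)
```

In both comparisons the degree to which a rule antecedent fires changes even though the underlying physical situation is the same, which is why the quality of the input data matters to the system's output.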
Fuzzy logic is a suitable way to model human thinking in the presence of such uncertainty.
Since it was introduced by Lotfi Zadeh in 1965, it has been used to build expert systems
that handle the ambiguity and vagueness associated with real-world problems involving
different kinds of uncertainty [4]. Thus, in order to strengthen a fuzzy system model, the
quality of the data used as input to the model should be enhanced. Outliers and noisy data
are one source of this uncertainty; they arise from mechanical faults, changes in system
behavior, fraudulent behavior, network intrusions, sensor and device errors, human error
and so on [5, 6]. To strengthen the fuzzy system model, outliers should therefore be
isolated. The following section gives the details of isolating outliers.
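As a rough picture of what isolating outliers from input data can look like, the sketch below applies Tukey's interquartile-range rule to a short list of readings. This rule is swapped in purely for illustration and is not the method described in this chapter; the function name `isolate_outliers`, the factor `k = 1.5`, and the sample readings are all assumptions.

```python
import statistics


def isolate_outliers(values, k=1.5):
    """Split values into (clean, outliers) using Tukey's IQR fences.

    A value is treated as an outlier if it lies more than k * IQR below the
    first quartile or above the third quartile. k = 1.5 is the conventional
    choice, assumed here for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    clean = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return clean, outliers


# Hypothetical sensor readings; 55.0 stands in for a device error.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0, 10.1]
clean, outliers = isolate_outliers(readings)
print(clean)
print(outliers)
```

Only the `clean` values would then be passed on as input to the fuzzy model; whether the isolated values are dropped or replaced is a separate design decision.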
1.1 The reason for isolating outliers
The main reason for isolating outliers is data quality assurance: exceptional values are
more likely to be incorrect. According to the definition given by Wand and Wang [7],
unreliable data represents a mismatch between the state of the database and the state of
the real world. For a variety of database applications, the amount of erroneous data may
reach ten percent or even more [8]. Thus, removing or replacing