3.7 Exercises
3.1 Data quality can be assessed in terms of several issues, including accuracy, completeness,
and consistency. For each of the above three issues, discuss how data quality assessment
can depend on the intended use of the data, giving examples. Propose two other
dimensions of data quality.
3.2 In real-world data, tuples with missing values for some attributes are a common
occurrence. Describe various methods for handling this problem.
3.3 Exercise 2.2 gave the following data (in increasing order) for the attribute age: 13, 15,
16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46,
52, 70.
(a) Use smoothing by bin means to smooth these data, using a bin depth of 3. Illustrate
your steps. Comment on the effect of this technique for the given data.
(b) How might you determine outliers in the data?
(c) What other methods are there for data smoothing?
3.4 Discuss issues to consider during data integration.
3.5 What are the value ranges of the following normalization methods?
(a) min-max normalization
(b) z-score normalization
(c) z-score normalization using the mean absolute deviation instead of standard deviation
(d) normalization by decimal scaling
3.6 Use these methods to normalize the following group of data:
200, 300, 400, 600, 1000
(a) min-max normalization by setting min = 0 and max = 1
(b) z-score normalization
(c) z-score normalization using the mean absolute deviation instead of standard deviation
(d) normalization by decimal scaling
3.7 Using the data for age given in Exercise 3.3, answer the following:
(a) Use min-max normalization to transform the value 35 for age onto the range
[0.0, 1.0].
(b) Use z-score normalization to transform the value 35 for age, where the standard
deviation of age is 12.94 years.
(c) Use normalization by decimal scaling to transform the value 35 for age.
(d) Comment on which method you would prefer to use for the given data, giving
reasons as to why.
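
The techniques named in Exercises 3.3 and 3.5 to 3.7 can be sketched in a few lines of Python, which may help when checking hand calculations. This is a minimal sketch under the usual textbook definitions, not a reference solution: the function names (smooth_by_bin_means, min_max, z_score, mean_abs_deviation, decimal_scaling) and the constant AGE are hypothetical names chosen for illustration, and decimal scaling is assumed to divide by the smallest power of 10 that brings every absolute value below 1.

```python
# Age data from Exercise 3.3 (already sorted in increasing order).
AGE = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
       30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def smooth_by_bin_means(sorted_values, depth):
    """Split sorted data into equal-depth bins; replace each value by its bin mean."""
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


def min_max(v, old_min, old_max, new_min=0.0, new_max=1.0):
    """Map v linearly from [old_min, old_max] onto [new_min, new_max]."""
    return (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min


def z_score(v, center, spread):
    """Standardize v; spread may be a standard deviation or a mean absolute deviation."""
    return (v - center) / spread


def mean_abs_deviation(values):
    """Average absolute distance of the values from their mean."""
    mu = sum(values) / len(values)
    return sum(abs(x - mu) for x in values) / len(values)


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer such that max |v| / 10**j < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j


if __name__ == "__main__":
    age_mean = sum(AGE) / len(AGE)
    print(smooth_by_bin_means(AGE, 3))
    print(min_max(35, min(AGE), max(AGE)))
    print(z_score(35, age_mean, 12.94))  # 12.94 is the deviation given in Exercise 3.7(b)
    print(z_score(35, age_mean, mean_abs_deviation(AGE)))
    print(decimal_scaling([200, 300, 400, 600, 1000]))
```

The same z_score helper serves both z-score variants because only the measure of spread changes. The sketch prints values only; interpreting them, and arguing which method suits the age data, remains the point of the exercises.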