(b) The RFID data may contain lots of redundant information. Discuss a method
that maximally reduces redundancy during data registration in the RFID data
warehouse.
(c) The RFID data may contain lots of noise such as missing registration and misread
IDs. Discuss a method that effectively cleans up the noisy data in the RFID data
warehouse.
(d) You may want to perform online analytical processing to determine how many TV
sets were shipped from the LA seaport to BestBuy in Champaign, IL, by month,
brand, and price range. Outline how this could be done efficiently if you were to
store such RFID data in the warehouse.
(e) If a customer returns a jug of milk and complains that it has spoiled before its
expiration date, discuss how you can investigate such a case in the warehouse to find out
what the problem is, either in shipping or in storage.
4.12 In many applications, new data sets are incrementally added to the existing large
data sets. Thus, an important consideration is whether a measure can be computed
efficiently in an incremental manner. Use count, standard deviation, and median as
examples to show that a distributive or algebraic measure facilitates efficient incremental
computation, whereas a holistic measure does not.
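
As a starting point for Exercise 4.12, the contrast can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference solution: the class name `RunningSummary` and its methods are hypothetical, and the standard deviation computed is the population form. The sketch keeps only three numbers (count, sum, sum of squares), which is enough to update the count and the standard deviation when a new batch arrives, while the median still needs the full set of values.

```python
import math
import statistics


class RunningSummary:
    """Hypothetical fixed-size summary: count, sum, and sum of squares."""

    def __init__(self):
        self.n = 0           # count: a distributive measure
        self.total = 0.0     # running sum
        self.total_sq = 0.0  # running sum of squares

    def add_batch(self, values):
        # A new batch is folded in without revisiting older data.
        for v in values:
            self.n += 1
            self.total += v
            self.total_sq += v * v

    def std(self):
        # Population standard deviation, an algebraic measure:
        # it is derived from a bounded number of distributive measures.
        mean = self.total / self.n
        variance = self.total_sq / self.n - mean * mean
        return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


old_batch = [2, 4, 4, 4]
new_batch = [5, 5, 7, 9]

summary = RunningSummary()
summary.add_batch(old_batch)
summary.add_batch(new_batch)  # incremental update from the summary alone

# The median is holistic: the medians of the two batches do not
# determine the median of the combined data, so all values are needed.
combined_median = statistics.median(old_batch + new_batch)
```

Here the medians of the two batches are 4 and 6, yet the median of the combined data is 4.5, which cannot be derived from those two numbers alone. Note that the sum-of-squares formula can lose precision on large or badly scaled data; it is used here only because it makes the algebraic structure easy to see.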
4.13 Suppose that we need to record three measures in a data cube: min(), average(), and
median(). Design an efficient computation and storage method for each measure given
that the cube allows data to be deleted incrementally (i.e., in small portions at a time)
from the cube.
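
One possible way to begin thinking about Exercise 4.13 is sketched below in Python. It is an illustration under assumptions, not the intended answer: the class `CellMeasures` is hypothetical, it models a single cube cell, and it assumes the cell can afford to keep a multiset of its base values. The average is maintained from a (sum, count) pair, so a deletion is a constant-time subtraction. The minimum is cached and recomputed only when the last copy of the current minimum is deleted.

```python
from collections import Counter


class CellMeasures:
    """Hypothetical single cube cell that supports deleting values."""

    def __init__(self, values):
        self.counts = Counter(values)  # multiset of base values (assumed to fit in the cell)
        self.count = len(values)
        self.total = sum(values)
        self.minimum = min(values) if values else None

    def delete(self, v):
        if self.counts[v] == 0:
            raise KeyError(f"value {v!r} is not stored in this cell")
        self.counts[v] -= 1
        if self.counts[v] == 0:
            del self.counts[v]
        # average(): only the (sum, count) pair has to change.
        self.count -= 1
        self.total -= v
        # min(): rescan the distinct values only if the last copy
        # of the cached minimum was just removed.
        if v == self.minimum and v not in self.counts:
            self.minimum = min(self.counts) if self.counts else None

    def average(self):
        return self.total / self.count if self.count else None


cell = CellMeasures([3, 1, 4, 1, 5])
cell.delete(1)
min_after_first = cell.minimum    # another copy of 1 remains
avg_after_first = cell.average()
cell.delete(1)
min_after_second = cell.minimum   # forces a rescan of the remaining values
avg_after_second = cell.average()
```

The median is left out of the sketch on purpose. Being holistic, it depends on the whole distribution of values, so a design for it would have to draw on something like the stored multiset (or an approximation of it) rather than on a fixed-size summary.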
4.14 In data warehouse technology, a multidimensional view can be implemented by
a relational database technique (ROLAP), by a multidimensional database technique
(MOLAP), or by a hybrid database technique (HOLAP).
(a) Briefly describe each implementation technique.
(b) For each technique, explain how each of the following functions may be
implemented:
i. The generation of a data warehouse (including aggregation)
ii. Roll-up
iii. Drill-down
iv. Incremental updating
(c) Which implementation techniques do you prefer, and why?
4.15 Suppose that a data warehouse contains 20 dimensions, each with about five levels of
granularity.
(a) Users are mainly interested in four particular dimensions, each having three
frequently accessed levels for rolling up and drilling down. How would you design a
data cube structure to support this preference efficiently?
(b) At times, a user may want to drill through the cube to the raw data for one or two
particular dimensions. How would you support this feature?
 