customer depended greatly on the customer's gender?” Notice that he believes time and
location play a role in predicting valued customers, but at what granularity levels do
they depend on gender for this task? For example, is performing analysis using {month,
country} better than {year, state}?
Consider a data table D (e.g., the customer table). Let X be the set of attributes for
which no concept hierarchy has been defined (e.g., gender, salary). Let Y be the class-
label attribute (e.g., valued customer), and Z be the set of multilevel attributes, that is,
attributes for which concept hierarchies have been defined (e.g., time, location). Let V
be the set of attributes for which we would like to define their predictiveness. In our
example, this set is {gender}. The predictiveness of V on a data subset can be quantified
by the difference in accuracy between the model built on that subset using X to predict Y
and the model built on that subset using X − V (e.g., {salary}) to predict Y. The intuition
is that, if the difference is large, V must play an important role in the prediction of class
label Y.
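
As a rough illustration of this definition, the following Python sketch computes the predictiveness of V = {gender} on a single data subset as the accuracy difference described above. The column names (gender, salary, valued_customer), the choice of a decision tree learner, and the single held-out split are illustrative assumptions, not part of the prediction cube framework itself; categorical attributes are assumed to be numerically encoded.

    # Sketch: predictiveness of V on one data subset as an accuracy difference.
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    def predictiveness(subset, X_attrs, V_attrs, y_attr):
        """Accuracy of the model built on X minus accuracy of the model built on X - V."""
        X_minus_V = [a for a in X_attrs if a not in V_attrs]
        train, test = train_test_split(subset, test_size=0.3, random_state=0)

        def held_out_accuracy(attrs):
            clf = DecisionTreeClassifier().fit(train[attrs], train[y_attr])
            return accuracy_score(test[y_attr], clf.predict(test[attrs]))

        return held_out_accuracy(X_attrs) - held_out_accuracy(X_minus_V)

    # e.g., predictiveness(cell_rows, ["gender", "salary"], ["gender"], "valued_customer")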
Given a set of attributes, V, and a learning algorithm, the prediction cube at granularity
⟨l_1, ..., l_d⟩ (e.g., ⟨year, state⟩) is a d-dimensional array, in which the value in each cell
(e.g., [2010, Illinois]) is the predictiveness of V evaluated on the subset defined by the
cell (e.g., the records in the customer table with time in 2010 and location in Illinois).
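
Continuing the sketch above, a prediction cube at granularity ⟨year, state⟩ could be materialized by grouping the table on those levels of the multilevel attributes and applying the predictiveness function to each cell; this is exactly the exhaustive approach discussed below. The customers DataFrame, the year/state columns (time and location already rolled up to those levels), and the dictionary representation of the cube are assumptions for illustration.

    # Sketch: a prediction cube at granularity <year, state>, stored as a dict keyed by cell.
    def prediction_cube(table, X_attrs, V_attrs, y_attr, levels=("year", "state")):
        cube = {}
        for cell, cell_rows in table.groupby(list(levels)):
            cube[cell] = predictiveness(cell_rows, X_attrs, V_attrs, y_attr)
        return cube

    # customers is a hypothetical pandas DataFrame holding the customer table;
    # cube[(2010, "Illinois")] is then the predictiveness of {gender} on that cell's records.
    cube = prediction_cube(customers, ["gender", "salary"], ["gender"], "valued_customer")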
...
Supporting OLAP roll-up and drill-down operations on a prediction cube is a
computational challenge requiring the materialization of cell values at many different
granularities. For simplicity, we can consider only full materialization. A naïve way to
fully materialize a prediction cube is to exhaustively build models and evaluate them for
each cell and granularity. This method is very expensive if the base data set is large.
An ensemble method called Probability-Based Ensemble (PBE) was developed as a
more feasible alternative. It requires model construction for only the finest-grained
cells. OLAP-style bottom-up aggregation is then used to generate the values of the
coarser-grained cells.
The prediction of a predictive model can be seen as finding a class label that maxi-
mizes a scoring function. The PBE method was developed to approximately make the
scoring function of any predictive model distributively decomposable. In our discus-
sion of data cube measures in Section 4.2.4, we showed that distributive and algebraic
measures can be computed efficiently. Therefore, if the scoring function used is dis-
tributively or algebraically decomposable, prediction cubes can also be computed with
efficiency. In this way, the PBE method reduces prediction cube computation to data
cube computation.
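
For instance, suppose a classifier's score depends only on class counts and per-attribute-value class counts. Those counts are distributive measures, so a coarser cell's counts are simply the sums of the counts of its finest-grained cells, and the score is an algebraic function of those sums. The sketch below illustrates this idea with naïve-Bayes-style counts; the count layout and function names are assumptions for illustration, not the PBE algorithm itself, and Laplace smoothing is omitted for brevity.

    # Sketch: counts are distributive, so a coarser cell's classifier can be assembled by
    # summing the counts stored at its finest-grained cells, with no retraining on raw data.
    from collections import Counter

    def merge_counts(finest_cells):
        """Each element: {"class": Counter({label: n}), "attr": Counter({(attr, value, label): n})}."""
        merged = {"class": Counter(), "attr": Counter()}
        for c in finest_cells:
            merged["class"] += c["class"]
            merged["attr"] += c["attr"]
        return merged

    def nb_score(counts, x, label):
        """Unnormalized naive Bayes score P(label) * prod_a P(x[a] | label), from counts alone."""
        total = sum(counts["class"].values())
        score = counts["class"][label] / total
        for attr, value in x.items():
            score *= counts["attr"][(attr, value, label)] / counts["class"][label]
        return score

    # The coarser cell [2010, Illinois] merges the counts of all finest-grained cells
    # (e.g., <month, city> cells) that roll up to it; its prediction for a record x is
    # the label maximizing nb_score(merged_counts, x, label).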
For example, previous studies have shown that the naïve Bayes classifier has an alge-
braically decomposable scoring function, and the kernel density-based classifier has a
distributively decomposable scoring function.⁸ Therefore, either of these could be used
to implement prediction cubes efficiently.
⁸ Naïve Bayes classifiers are detailed in Chapter 8. Kernel density-based classifiers, such as support vector
machines, are described in Chapter 9.
 