Data reduction is also widely used in marketing research. The views, perceptions,
and preferences of the respondents are often recorded through a large
number of questions that investigate all the topics of interest in detail. These
questions often have the form of a Likert scale, where respondents are asked
to state, on a scale of 1-5, the degree of importance, preference, or agreement
on specific issues. The answers can be used to identify the latent concepts that
underlie the respondents' views.
To further explain the basic concepts behind data reduction techniques, let us
consider the simple case of a few customers of a mobile telephony operator. SMS,
MMS, and voice call traffic, specifically the number of calls by service type and
the minutes of voice calls, were analyzed with principal components analysis (PCA). The modeling
dataset and the respective results are given in Table 2.9.
The PCA model analyzed the associations among the original fields and
identified two components. More specifically, SMS and MMS usage appear to
be correlated, and a new component was extracted to represent the usage of those
services. Similarly, the number and the minutes of voice calls were also correlated.
The second component represents these two fields and measures the voice usage
intensity. Each derived component is standardized, with an overall population
mean of 0 and a standard deviation of 1. The component scores denote how many
standard deviations above or below the overall mean each record stands. In simple
terms, a positive score in component 1 indicates high SMS and MMS usage while a
negative score indicates below-average usage. Similarly, high scores on component
Table 2.9 The modeling dataset for principal components analysis and the derived
component scores.

Input fields: monthly average number of SMS calls, MMS calls, voice calls, and voice call minutes.
Model-generated fields: component 1 score ("SMS/MMS usage") and component 2 score ("voice usage").

Customer ID   SMS calls   MMS calls   Voice calls   Voice call minutes   Component 1 score   Component 2 score
     1           19           4           90              150                 -0.57                1.99
     2           43          12           30               35                  0.61               -0.42
     3           13           3           10               20                 -0.94               -1.05
     4           60          14          100               80                  1.34                1.38
     5            5           1           30               55                 -1.27               -0.29
     6           56          11           25               35                  0.78               -0.48
     7           25           7           30               28                 -0.25               -0.57
     8            3           1           65               82                 -1.23                0.65
     9           40           9           15               30                  0.22               -0.76
    10           65          15           20               40                  1.33               -0.46
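
The calculation described above can be approximated in a few lines of Python. The following is only a minimal sketch, assuming NumPy is available and using the four input fields of Table 2.9. It computes the correlations among the fields, extracts two principal components from the correlation matrix, and rescales the component scores to a mean of 0 and a standard deviation of 1. The text does not say whether the components were rotated or which software produced them, so the scores obtained here are unrotated and need not match the model-generated fields in the table. The variable names are illustrative.

```python
import numpy as np

# Input fields from Table 2.9, one row per customer:
# SMS calls, MMS calls, voice calls, voice call minutes (monthly averages)
X = np.array([
    [19,  4,  90, 150],
    [43, 12,  30,  35],
    [13,  3,  10,  20],
    [60, 14, 100,  80],
    [ 5,  1,  30,  55],
    [56, 11,  25,  35],
    [25,  7,  30,  28],
    [ 3,  1,  65,  82],
    [40,  9,  15,  30],
    [65, 15,  20,  40],
], dtype=float)

# Standardize each field (sample standard deviation, ddof=1)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix of the four input fields
R = np.corrcoef(X, rowvar=False)

# Eigen-decomposition of the correlation matrix; eigh returns
# eigenvalues in ascending order, so re-sort them in descending order
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep two components (assumption: unrotated solution) and project the data
k = 2
raw_scores = Z @ eigvecs[:, :k]

# Rescale each component score to standard deviation 1 (the mean is already 0)
scores = raw_scores / raw_scores.std(axis=0, ddof=1)

print(np.round(R, 2))                         # correlations among the fields
print(np.round(eigvals / eigvals.sum(), 2))   # share of variance per component
print(np.round(scores, 2))                    # standardized component scores
```

In the correlation matrix, the SMS/MMS pair and the voice calls/voice minutes pair show the strongest associations, which is the pattern the two components summarize. A rotation step (for example varimax) would typically be applied afterwards to align each component with one group of fields; it is omitted here to keep the sketch short.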