Table 3.1 Behavioral fields used in the PCA example.
Field name               Description
VOICE_OUT_CALLS          Monthly average of outgoing voice calls
VOICE_OUT_MINS           Monthly average number of minutes of outgoing voice calls
SMS_OUT_CALLS            Monthly average of outgoing SMS calls
MMS_OUT_CALLS            Monthly average of outgoing MMS calls
OUT_CALLS_ROAMING        Monthly average of outgoing roaming calls (calls made in a foreign country)
GPRS_TRAFFIC             Monthly average GPRS traffic
PRC_VOICE_OUT_CALLS      Percentage of outgoing voice calls: outgoing voice calls as a percentage of total outgoing calls
PRC_SMS_OUT_CALLS        Percentage of SMS calls
PRC_MMS_OUT_CALLS        Percentage of MMS calls
PRC_INTERNET_CALLS       Percentage of Internet calls
PRC_OUT_CALLS_ROAMING    Percentage of outgoing roaming calls: roaming calls as a percentage of total outgoing calls
Statistical Hypothesis Testing and Significance
Statistical hypothesis testing is applied when we want to make inferences
about the whole population by using sample results. It involves the formulation
of a null hypothesis that is tested against an opposite, alternative
hypothesis. The null hypothesis states that an observed effect is simply due
to chance or random variation of the particular dataset examined.
As an example of statistical testing, let us consider the case of the
correlations between the phone usage fields presented in Table 3.2 and
examine whether there is indeed a linear association between the number
and the minutes of voice calls. The null hypothesis to be tested states that
these two fields are not (linearly) associated in the population. This hypothesis
is to be tested against an alternative hypothesis which states that these two
fields are correlated in the population. Thus the statistical test examines the
following statements:
H0: the linear correlation in the population is 0 (no linear association); versus
Ha: the linear correlation in the population differs from 0.
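As a rough illustration of how such a test can be carried out, the sketch below computes the sample correlation and then estimates a two-sided p-value with a permutation test. The permutation approach is swapped in here because it needs only the Python standard library; it is not necessarily the procedure used in the text. The data are synthetic and made up for illustration, and the function names pearson_r and permutation_p_value are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for H0: no linear association.

    Shuffling y breaks any link between the two fields, so the shuffled
    correlations show what chance alone would produce under H0.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Synthetic stand-ins for VOICE_OUT_CALLS and VOICE_OUT_MINS (assumed values,
# not the data analyzed in the text).
data_rng = random.Random(42)
voice_out_calls = [data_rng.uniform(10, 200) for _ in range(100)]
voice_out_mins = [1.5 * c + data_rng.gauss(0, 40) for c in voice_out_calls]

r = pearson_r(voice_out_calls, voice_out_mins)
p = permutation_p_value(voice_out_calls, voice_out_mins)
print(f"sample correlation r = {r:.2f}, permutation p-value = {p:.4f}")
```

A small p-value means that a correlation at least as large as the observed one would rarely arise if the two fields were unrelated, which counts as evidence against H0.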
The sample estimate of the population correlation coefficient is quite
large (0.84) but this may be due to the particular data analyzed (one