Information Technology Reference
In-Depth Information
This chapter is an illustrative tutorial that demonstrates how a statistical classification model
can be used to identify key drivers of NPS. Our premise is that the classification model, the
data it operates on, and the analyses it provides could usefully form components of a
Decision Support System that can not only provide both snapshot and longitudinal analyses
of NPS performance, but also enable analyses that can help suggest company initiatives
aimed toward lifting the NPS.
We assume that the NPS question was asked as part of larger survey that also probed
customer satisfaction levels with respect to various dimensions of the company's services.
We develop a predictive classification model for customer advocacy (promoter, passive or
detractor) as a function of these service dimensions. A novelty associated with our
classification model is the optional use of constraints on the parameter estimates to enforce a
monotonic property. We provide a detailed explanation of how to fit the model using the
SAS software package and show how the fitted model can be used to develop company
policies that have promise for improving the NPS. Our primary objective is to teach an
interested practitioner how to use customer survey data together with a statistical classifier
to identify key drivers of NPS. We present a case study that is based on a real-life data
collection and analysis project to illustrate the step-by-step process of building the linkage
between customer satisfaction data and NPS.
2. Logistic regression
In this section we provide a brief review of logistic and multinomial regression. Allen and
Rao (2000) is a good reference that contains more detail than we provide, and additionally
has example applications pertaining to customer satisfaction modeling.
2.1 Binomial logistic regression
The binomial logistic regression model assumes that the response variable is binary (0/1).
This could be the case, for example, if a customer is simply asked the question “Would you
recommend us to a friend?” Let
{ i Y denote the responses from n customers, assigning a
“1” for Yes and “0” for No. Suppose a number of other data items (covariates) are polled
from the customer on the same survey instrument. These items might measure the
satisfaction of the customer across a wide variety of service dimensions and might be
measured on a traditional Likert scale. We let
1
x denote the vector of covariates for the i -th
sampled customer and note that it reflects the use of dummy variable coding for covariates
that are categorical scale. For example, if the first covariate is measured on a 5-point Likert
scale, its value is encoded into
i
5
x by using five dummy variables
{ jj
x
, where
x  if
1
i
1
1
1
and only if the Likert response is j .
The binomial logistic regression model posits that Y is a Bernoulli random variable
(equivalently, a binomial random variable with trial size equal to one) with success
probability p , and further, that the success probability is tied to the covariates through the
so-called link function
is a vector of model
parameters (slopes). Continuing with the 5-point Likert scale example above, there would be
five slopes
p
exp(
 
x
) /[1
exp(
 
x
)]
, where 
i
i
i
5
5
{
jj
associated with the five dummy variables
{ jj
x
used to code the first
1
1
1
1
covariate.
Model fitting for the binomial logistic regression model entails estimating the parameters 
and 
via maximum likelihood. The likelihood function for this model is
Search WWH ::




Custom Search