Identification of Key Drivers of Net Promoter Score Using a Statistical Classification Model - Efficient Decision Support Systems: Practice and Challenges from Current to Future

This chapter is an illustrative tutorial that demonstrates how a statistical classification model can be used to identify key drivers of NPS. Our premise is that the classification model, the data it operates on, and the analyses it provides could usefully form components of a Decision Support System that can not only provide both snapshot and longitudinal analyses of NPS performance, but also enable analyses that can help suggest company initiatives aimed toward lifting the NPS.

We assume that the NPS question was asked as part of a larger survey that also probed customer satisfaction levels with respect to various dimensions of the company's services. We develop a predictive classification model for customer advocacy (promoter, passive or detractor) as a function of these service dimensions. A novelty associated with our classification model is the optional use of constraints on the parameter estimates to enforce a monotonic property. We provide a detailed explanation of how to fit the model using the SAS software package and show how the fitted model can be used to develop company policies that have promise for improving the NPS. Our primary objective is to teach an interested practitioner how to use customer survey data together with a statistical classifier to identify key drivers of NPS. We present a case study that is based on a real-life data collection and analysis project to illustrate the step-by-step process of building the linkage between customer satisfaction data and NPS.

2. Logistic regression

In this section we provide a brief review of logistic and multinomial regression. Allen and Rao (2000) is a good reference that contains more detail than we provide, and additionally has example applications pertaining to customer satisfaction modeling.

2.1 Binomial logistic regression

The binomial logistic regression model assumes that the response variable is binary (0/1). This could be the case, for example, if a customer is simply asked the question “Would you recommend us to a friend?” Let $\{Y_i\}_{i=1}^{n}$ denote the responses from $n$ customers, assigning a “1” for Yes and “0” for No. Suppose a number of other data items (covariates) are polled from the customer on the same survey instrument. These items might measure the satisfaction of the customer across a wide variety of service dimensions and might be measured on a traditional Likert scale. We let $\mathbf{x}_i$ denote the vector of covariates for the $i$-th sampled customer and note that it reflects the use of dummy variable coding for covariates that are measured on a categorical scale. For example, if the first covariate is measured on a 5-point Likert scale, its value is encoded into $\mathbf{x}_i$ by using five dummy variables $\{x_{1j}\}_{j=1}^{5}$, where $x_{1j} = 1$ if and only if the Likert response is $j$.
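The dummy variable coding described above can be illustrated with a minimal Python sketch. The function name `likert_dummies` and the four sample responses are hypothetical and serve only to show the encoding; they do not come from the chapter's data.

```python
def likert_dummies(response, levels=5):
    """Encode a Likert response j in 1..levels as `levels` 0/1 dummy variables.

    Position j-1 of the returned list is 1 if and only if the response is j.
    """
    if not 1 <= response <= levels:
        raise ValueError(f"response must be between 1 and {levels}")
    return [1 if j == response else 0 for j in range(1, levels + 1)]


# Hypothetical responses from four customers on the first service dimension.
responses = [5, 3, 1, 4]
encoded = [likert_dummies(r) for r in responses]

for r, row in zip(responses, encoded):
    print(r, row)
```

Each encoded row contains exactly one 1, in the position matching the Likert response, so a response of 3 becomes `[0, 0, 1, 0, 0]`.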

The binomial logistic regression model posits that $Y_i$ is a Bernoulli random variable (equivalently, a binomial random variable with trial size equal to one) with success probability $p_i$, and further, that the success probability is tied to the covariates through the so-called link function $p_i = \exp(\alpha + \boldsymbol{\beta}'\mathbf{x}_i) / [1 + \exp(\alpha + \boldsymbol{\beta}'\mathbf{x}_i)]$, where $\alpha$ is an intercept and $\boldsymbol{\beta}$ is a vector of model parameters (slopes). Continuing with the 5-point Likert scale example above, there would be five slopes $\{\beta_{1j}\}_{j=1}^{5}$ associated with the five dummy variables $\{x_{1j}\}_{j=1}^{5}$ used to code the first covariate.
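As a numerical illustration of how the success probability follows from the covariates, the Python sketch below evaluates the logistic form exp(eta) / [1 + exp(eta)] for one customer, with eta equal to an intercept plus the sum of slope times dummy variable. The intercept `alpha` and the five slope values are hypothetical numbers chosen for the example, not estimates from the chapter.

```python
import math


def success_probability(alpha, beta, x):
    """Return exp(eta) / (1 + exp(eta)) with eta = alpha + sum(beta_k * x_k)."""
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    eta = alpha + sum(b * xk for b, xk in zip(beta, x))
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical parameter values (assumptions for illustration only).
alpha = -1.0
beta = [-1.5, -0.5, 0.0, 1.0, 2.0]  # one slope per Likert level 1..5

x = [0, 0, 0, 0, 1]  # dummy coding of a Likert response of 5
p = success_probability(alpha, beta, x)
print(round(p, 4))  # eta = -1 + 2 = 1, so p is about 0.7311
```

Because only one dummy variable is nonzero, the linear predictor reduces to the intercept plus the slope for the observed Likert level.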

Model fitting for the binomial logistic regression model entails estimating the intercept $\alpha$ and the slope vector $\boldsymbol{\beta}$ via maximum likelihood. The likelihood function for this model is
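For independent Bernoulli responses with success probabilities $p_i$, the likelihood takes the standard form $\prod_{i=1}^{n} p_i^{Y_i}(1-p_i)^{1-Y_i}$. The Python sketch below evaluates its logarithm and maximizes it with Newton's method. It rests on several assumptions made only for illustration: a single numeric covariate instead of dummy-coded Likert items, a hypothetical toy data set, and Newton's method as the optimizer. It is not the SAS procedure used in the chapter.

```python
import math


def sigmoid(eta):
    """Logistic function exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(alpha, beta, xs, ys):
    """Bernoulli log-likelihood: sum of y*eta - log(1 + exp(eta))."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = alpha + beta * x
        total += y * eta - math.log1p(math.exp(eta))
    return total


def fit_newton(xs, ys, max_iter=25, tol=1e-10):
    """Maximize the log-likelihood over (alpha, beta) by Newton's method."""
    alpha, beta = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(alpha + beta * x)
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to alpha
            g1 += (y - p) * x      # score with respect to beta
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information matrix times the score vector.
        d_alpha = (h11 * g0 - h01 * g1) / det
        d_beta = (h00 * g1 - h01 * g0) / det
        alpha += d_alpha
        beta += d_beta
        if max(abs(d_alpha), abs(d_beta)) < tol:
            break
    return alpha, beta


# Hypothetical toy data: satisfaction score (1..5) and a 0/1 recommendation.
xs = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1]

alpha_hat, beta_hat = fit_newton(xs, ys)
print(alpha_hat, beta_hat, log_likelihood(alpha_hat, beta_hat, xs, ys))
```

The toy responses overlap across satisfaction scores (a score of 2 yields both a Yes and a No), so the data are not perfectly separated and the maximum likelihood estimates are finite.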
