2 Likert scales are ordinally scaled. Unlike with metric data, the distance between answer
options is not defined. Most segmentation algorithms are based on distance computations,
but distance cannot easily be measured at the ordinal level.
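The second problem can be made concrete with a small sketch in Python. It assumes two hypothetical numeric codings of the same five answer options, both of which preserve the order of the options and are therefore equally defensible for ordinal data. The respondent names, answers and codings are invented for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two coded answer vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(target, others, coding):
    """Name of the respondent closest to `target` under a given coding."""
    def encode(answers):
        return [coding[a] for a in answers]

    coded_target = encode(target)
    return min(
        others,
        key=lambda name: squared_distance(coded_target, encode(others[name])),
    )


# Two order-preserving codings of the answer options 1..5 (both hypothetical).
EQUAL_STEPS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
WIDE_EXTREMES = {1: 1, 2: 4, 3: 5, 4: 6, 5: 9}

# Invented answers of three respondents to two Likert items.
target = [2, 2]
others = {"A": [1, 2], "B": [3, 3]}

print(nearest(target, others, EQUAL_STEPS))    # nearest neighbour under coding 1
print(nearest(target, others, WIDE_EXTREMES))  # nearest neighbour under coding 2
```

Under the first coding respondent A is the nearest neighbour, under the second it is respondent B, although the ordinal information is identical in both cases. Any distance-based algorithm inherits this dependence on the chosen coding.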
One way of avoiding both problems above is to use full binary answer formats (where
respondents are asked to answer with a Yes or a No). The full binary answer format does not
capture response styles, and distance can easily be computed. This approach was recommended
long ago by Cronbach (1950) because the use of binary or dichotomous scales addresses
the problem of response styles at its very source and does not give respondents the
opportunity to display them. Cronbach's suggestion has not been taken up; instead, multi-category
answer formats, mostly the five- and seven-point Likert scales, still dominate survey research
in tourism.
Data-driven segmentation step #3: Forming of segments
After the data is collected, a segmentation algorithm is used to identify or create market segments
based on the segmentation base. A number of critical decisions are made during this step.
First, it has to be assessed whether the exercise of forming segments is likely to reveal natural,
reproducible or constructed segments. The implication is that where natural segments exist, the
aim of the analysis is to identify the true segmentation solution. If, however, and this is the
trickiest case, clusters cannot be reproduced when the analysis is repeated and thus constructed
segments will be formed, the responsibility of the data analysts shifts to presenting a range of
interesting solutions to management and letting management choose which is most strategically
useful to them.
Second, and related to the first point, a decision about the number of clusters needs to be
made (this is also true for hierarchical clustering, although the dendrogram may offer
some guidance). Obviously, the number of clusters will hugely influence the final segmentation
result. If, for example, two segments are chosen, it is likely that one will simply contain respondents
who tended to say Yes to questions and the other will contain respondents who tended to say No.
Typically, this is not very informative for management. If, however, the same respondents are
grouped into a larger number of clusters, more distinct patterns will start to emerge.
One way to help resolve both issues discussed above is simply to repeat the segmentation
analysis multiple times for a range of numbers of clusters (for example, ten calculations with four
segments, ten calculations with five segments . . . and ten calculations with ten segments) and
compare the resulting segments. This procedure has been proposed and illustrated by Dolnicar
and Leisch (2010) using R code which runs the repeat analysis automatically, but it can be
reproduced with other statistical packages. If the exact same segments emerge from repeated
computations, it can be assumed that natural segments exist; if similar segments emerge, the
segments are likely to be reproducible; and if segments are different every single time, then
segments need to be constructed artificially. As for the number of clusters, the number
that leads to the most stable results (meaning that similar segments result from repeated
computations) is preferable.
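The repeated analysis can be sketched in a few lines of Python. This is not the R code of Dolnicar and Leisch (2010), only a minimal illustration under stated assumptions: a plain k-means is used as the segmentation algorithm, agreement between two runs is measured with the Rand index (the share of respondent pairs that both runs treat the same way), and the data are synthetic binary answers generated from three invented segment profiles. All function names are hypothetical.

```python
import random
from itertools import combinations


def kmeans(data, k, rng, max_iter=100):
    """Plain k-means; returns one cluster label per respondent."""
    # Start from k distinct answer patterns so no two centers coincide.
    # (rng.sample raises ValueError if there are fewer than k patterns.)
    patterns = sorted(set(map(tuple, data)))
    centers = [list(p) for p in rng.sample(patterns, k)]
    labels = None
    for _ in range(max_iter):
        # Assign every respondent to the nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centers[c])))
            for row in data
        ]
        if new_labels == labels:  # assignments no longer change
            break
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rand_index(a, b):
    """Share of respondent pairs grouped consistently by labelings a and b."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(data, k, repeats=10, seed=0):
    """Mean pairwise Rand index over repeated runs with k clusters."""
    rng = random.Random(seed)
    runs = [kmeans(data, k, rng) for _ in range(repeats)]
    scores = [rand_index(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Synthetic Yes/No answers: three invented profiles, 10% answer noise.
    rng = random.Random(1)
    profiles = [(1, 1, 1, 0, 0, 0), (0, 0, 0, 1, 1, 1), (1, 0, 1, 0, 1, 0)]
    data = [
        [bit if rng.random() > 0.1 else 1 - bit for bit in profile]
        for profile in profiles
        for _ in range(20)
    ]
    for k in range(2, 5):
        print(k, round(stability(data, k), 3))
```

A value of 1 means every repetition produced exactly the same segments, values somewhat below 1 point to similar segments, and low values indicate that segments differ from run to run. Where to draw the line between these cases remains a judgment call; the sketch only makes the comparison across numbers of clusters systematic.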
Other, less critical decisions at this stage include the choice of algorithm. Some algorithms
have known tendencies to create clusters of certain shapes, but our research has shown over
the years that the algorithm is only critical in the case of constructive clustering (Buchta et al.
1997). If there is sufficient structure in the data, most algorithms will lead to similar solutions.
Another decision is the choice of distance measure, which needs to be suitable for the scale of
the data.
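For binary Yes/No data, two commonly used distance measures are sketched below in Python. The text does not name a specific measure, so both are offered only as examples: the Hamming distance counts the questions on which two respondents disagree, while the Jaccard distance ignores questions to which both said No, which can be preferable when a shared No carries little information. Answers are assumed to be coded as 1 for Yes and 0 for No.

```python
def hamming(a, b):
    """Number of questions on which two respondents answered differently."""
    return sum(1 for x, y in zip(a, b) if x != y)


def jaccard_distance(a, b):
    """1 minus (shared Yes answers / questions with at least one Yes)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two respondents who said No to everything are treated as identical.
    return 0.0 if either == 0 else 1 - both / either


# Invented answers of two respondents to five Yes/No questions.
first = [1, 1, 0, 0, 0]
second = [1, 0, 0, 0, 0]

print(hamming(first, second))           # they differ on one question
print(jaccard_distance(first, second))  # one shared Yes out of two
```

The two measures can rank pairs of respondents differently, so the choice should follow from whether agreement on No answers is meaningful for the segmentation base at hand.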