probabilistic models such as Latent Class Analysis. Generally, the technique selected depends on
the classification goals, the metric properties of the underlying variables and on the similarity or
density measure (Aldenderfer and Blashfield 1984). In tourism marketing hierarchical and
partitioning segmentations are most common (Dolnicar 2002).
The choice of similarity measure (i.e. correlation coefficients, distance measures,
association coefficients and probabilistic similarity measures) is critical, but the algorithm
for creating clusters usually attracts greater attention (Aldenderfer and Blashfield 1984). With
respect to hierarchical cluster analysis the most prominent algorithms are single linkage (Sneath
1957), complete linkage (Sokal and Michener 1958) and Ward's method (Ward 1963). In tourism
about half of all clustering studies are hierarchical; about one third of these fail to specify the
linkage algorithm; Ward's method is used most often followed by complete linkage (Mazanec
et al. 2010). Tkaczynski et al. (2010) suggest a two-step analysis in which the hierarchical clustering
is preceded by a grouping of cases into pre-clusters. As with hierarchical cluster analysis, heuristics
for partitioning methods are available for choosing the number of clusters or seed points,
types of pass (i.e. ways in which cases are allocated to groups) and statistical criteria (for
determining how to compute homogeneity). The most prominent are the k-means and hill-climbing
passes. Partitioning methods cannot guarantee globally optimal solutions. Non-hierarchical
procedures are favoured for binary data and large data sets (Hair et al. 1998). Tourism studies
also apply the so-called two-stage clustering procedure, in which a hierarchical primer is used to
define the number of clusters and k-means is then applied to generate a partition (Punj and
Stewart 1983).
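The two-stage procedure can be illustrated with a minimal sketch in plain Python. It is not taken from any of the cited studies and rests on several assumptions: the nine two-dimensional cases are invented, the number of clusters is read off the Ward merge costs with a simple "largest jump" heuristic, and the centroids of the Ward partition are used as k-means seed points. All function names are hypothetical.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(cluster):
    """Variable-wise mean of the cases in a cluster."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def ward_cost(c1, c2):
    """Increase in within-cluster sum of squares if c1 and c2 are merged."""
    n1, n2 = len(c1), len(c2)
    return n1 * n2 / (n1 + n2) * sq_dist(centroid(c1), centroid(c2))

def ward_history(points):
    """Stage 1: agglomerate down to one cluster with Ward's criterion.
    Returns a list of (partition before the merge, cost of the merge)."""
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        history.append(([list(c) for c in clusters],
                        ward_cost(clusters[i], clusters[j])))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return history

def choose_partition(history):
    """Assumed heuristic: stop before the merge whose cost jumps the most.
    Needs at least three cases so that there is a jump to compare."""
    costs = [cost for _, cost in history]
    jumps = [costs[m] - costs[m - 1] for m in range(1, len(costs))]
    stop = 1 + max(range(len(jumps)), key=jumps.__getitem__)
    return history[stop][0]

def k_means(points, seeds, max_iter=100):
    """Stage 2: k-means passes starting from the given seed points."""
    centers = list(seeds)
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:  # assignments have stabilised
            break
        centers = new_centers
    return labels, centers

# Invented toy data: nine cases measured on two variables.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]

partition = choose_partition(ward_history(points))  # hierarchical primer
seeds = [centroid(c) for c in partition]            # assumption: Ward centroids as seeds
labels, centers = k_means(points, seeds)
print("number of clusters:", len(partition))
print("k-means labels:", labels)
```

Because k-means only refines the partition it starts from, the result still depends on the seed points. This is the practical meaning of the remark that partitioning methods cannot guarantee globally optimal solutions.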
Segmentation base and data preprocessing
For all a posteriori segmentation methods a base of variables has to be defined. In tourism marketing,
clustering is most often based on motives or needs and activities, followed by benefits sought
and attitudes. Regrettably, there are also segmentation studies that base the classification on
a seemingly arbitrary mixture of different variables rather than on one well-defined behavioural
concept (for more elaborate criticism see Mazanec et al. 2010). Tourism researchers often choose
large numbers of variables to capture the scope of a concept, but manage to collect only small
numbers of survey respondents, which leads to methodological problems (Dolnicar and Leisch
2010). To overcome this problem many researchers condense data before the actual segmentation
by applying Correspondence Analysis (e.g. Arimond and Elfessi 2001), Conjoint Analysis (e.g.
Sedmak and Michalic 2008) and most often PCA (e.g. Decrop and Zidda 2006). These methods
are also used to improve the level of scaling. Within tourism research, pre-processing measures as
well as data standardization are heavily criticized (Dolnicar and Grün 2008). Sheppard (1996)
demonstrates the differences between using raw data and factor scores by means of artificial data.
He shows consequences for the segment structure, variation of dimensionality across segments
and problems with items that would be discarded due to prior factoring. In spite of these findings,
there are still publications featuring factor-cluster analysis in top tourism journals (e.g.
Suni and Komppula 2012). Recently, improvements in handling the drawbacks of cluster procedures
have entered tourism marketing, and authors also propose statistically sound ways to
overcome data dimensionality problems.
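That preprocessing is not a neutral step can be seen in a small sketch on invented data. It does not reproduce the analyses of Dolnicar and Grün (2008) or Sheppard (1996); it only shows, under the assumption of Euclidean distance and z-standardization with the population standard deviation, that standardizing can change which respondent is the nearest neighbour of another, and hence which cases a distance-based clustering would group together. The variables (trip spending and a 1 to 7 rating) and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def z_standardize(rows):
    """Rescale every variable to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds)) for row in rows]

def nearest_neighbour(rows, i):
    """Index of the case closest to case i in Euclidean distance."""
    return min((j for j in range(len(rows)) if j != i),
               key=lambda j: math.dist(rows[i], rows[j]))

# Invented respondents: (trip spending, rating on a 1-7 scale).
raw = [(1000, 1), (1100, 7), (1400, 1), (1700, 7)]

# On raw data spending dominates the distance; after standardization
# the rating carries as much weight as spending.
print("nearest to respondent 0, raw data:    ", nearest_neighbour(raw, 0))
print("nearest to respondent 0, standardized:", nearest_neighbour(z_standardize(raw), 0))
```

On the raw data respondent 0 is closest to the respondent with similar spending, whereas after standardization it is closest to the one with the same rating.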
Recent developments
When dealing with high-dimensional data, Dolnicar et al. (2012) suggest either collecting
large samples, including only most managerially relevant items based on a series of pretests, or