Geoscience Reference
In-Depth Information
introduced an additional independent variable which is the day for the given
position of the cyclone from the starting date of its genesis.
Let $z_i$ be an $n_i \times 2$ matrix of latitude and longitude measurements for cyclone
track $i$, let $t_i$ be an $n_i \times 1$ vector of corresponding discrete position indices
$\{0, 1, \ldots, n_i - 1\}$ and let $d_i$ be an $n_i \times 1$ vector of corresponding discrete day
indices. Here, $n_i$ is the number of latitude and longitude positions for cyclone
track $i$. We model both longitude and latitude with a polynomial regression
model of order $p$ ($p = 2$), with $t_i$ and $d_i$ as the independent variables. Under the
assumption that track $i$ was generated by cluster $k$, we have

$$z_i = T_i \beta_k + D_i \delta_k + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Sigma_k)$$

Here $T_i$ is the $n_i \times (p + 1)$ Vandermonde regression matrix associated with
the vector $t_i$: it has $(p + 1)$ columns, and the $m$th column contains the
components of $t_i$ raised to the power $m$, for $0 \le m \le p$;
$D_i$ is the $n_i \times 1$ matrix associated with the vector $d_i$. Here, $\beta_k$ is a $(p + 1) \times 2$
matrix and $\delta_k$ is a $1 \times 2$ matrix of regression coefficients for cluster $k$, each containing the
longitude coefficients in the first column and the latitude coefficients in the second
column; and $\varepsilon_i$ is an $n_i \times 2$ matrix of multivariate Gaussian noise, with zero
mean and a $2 \times 2$ covariance matrix $\Sigma_k$. The covariance matrix contains diagonal
elements $\sigma_{k,1}^2$ and $\sigma_{k,2}^2$, which are the noise variances for each longitude and latitude
observation, respectively. The cross covariance is set to zero for simplicity.
Further details of the regression mixture model definition and the estimation
algorithm are given by Gaffney (2004) and Camargo et al. (2007). Model
estimation was done using the 'flexmix' package (Gruen and Leisch, 2007) for
the R software (R Development Core Team, 2010).
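To make the structure of the regression part concrete, the following is a minimal sketch of how the Vandermonde matrix for the position indices and the cluster mean track could be assembled for a single track. It is an illustration only, not the estimation code used here (that was done with 'flexmix' in R). The function names, the coefficient values and the example track are hypothetical, and we assume that day indices start at 0 on the day of genesis.

```python
def vandermonde(t, p=2):
    """Return the n x (p+1) matrix whose m-th column holds t**m, 0 <= m <= p."""
    return [[ti ** m for m in range(p + 1)] for ti in t]


def mean_track(t, d, beta, delta, p=2):
    """Cluster mean track for one cyclone, as a list of [lon, lat] rows.

    beta  : (p+1) x 2 nested list of polynomial coefficients
    delta : length-2 list, the 1 x 2 row of day coefficients
    Column 0 is longitude, column 1 is latitude.
    """
    T = vandermonde(t, p)
    track = []
    for row, day in zip(T, d):
        lon = sum(row[m] * beta[m][0] for m in range(p + 1)) + day * delta[0]
        lat = sum(row[m] * beta[m][1] for m in range(p + 1)) + day * delta[1]
        track.append([lon, lat])
    return track


# Hypothetical track with four positions, two per day (assumed values)
t = [0, 1, 2, 3]          # position indices 0 .. n_i - 1
d = [0, 0, 1, 1]          # day indices, assumed to start at 0
beta = [[88.0, 12.0], [-0.5, 0.4], [0.01, 0.02]]  # made-up coefficients
delta = [-0.2, 0.1]                               # made-up coefficients

T = vandermonde(t)
track = mean_track(t, d, beta, delta)
```

In the fitted model the observed positions scatter around this mean with independent Gaussian noise in longitude and latitude, since the cross covariance is set to zero.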
5. Results
5.1 Regression Mixture Models
To select the most appropriate number of clusters for the regression
mixture model, we used the log-likelihood values.
Figure 1 shows the observed log-likelihood values for different numbers of
clusters. The AIC and BIC values were also computed from the log-likelihood,
and are shown in Fig. 1. We see a diminishing improvement in fit beyond K =
5, suggesting a five-cluster solution. We also examined the K = 5, 6
and 7 cluster solutions and found that increasing the number of clusters beyond
five only led to a subdivision of existing clusters, rather than the formation of
new clusters. It may be noted that in other ocean basins the optimum number of
clusters was found to range from three to six (Elsner, 2003; Camargo et
al., 2007; Nakamura et al., 2009).
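As a rough illustration of how AIC and BIC follow from the log-likelihood, the sketch below uses the standard definitions AIC = -2 log L + 2q and BIC = -2 log L + q log n, where q is the number of free parameters and n the number of observations. The parameter count is an assumption based on the model as described (for each cluster, a (p + 1) × 2 and a 1 × 2 coefficient matrix plus two noise variances, and K - 1 free mixing weights); the count used internally by 'flexmix' may differ, and the function names are hypothetical.

```python
import math


def n_free_params(K, p=2):
    """Assumed number of free parameters for a K-cluster model."""
    per_cluster = 2 * (p + 1) + 2 + 2  # polynomial coefs, day coefs, variances
    return K * per_cluster + (K - 1)   # plus K - 1 free mixing weights


def aic(loglik, q):
    """Akaike information criterion for log-likelihood loglik, q parameters."""
    return -2.0 * loglik + 2.0 * q


def bic(loglik, q, n_obs):
    """Bayesian information criterion; n_obs is the number of observations."""
    return -2.0 * loglik + q * math.log(n_obs)


def criteria_by_k(logliks, n_obs, p=2):
    """Map each K in logliks (dict K -> log-likelihood) to (AIC, BIC)."""
    out = {}
    for K, ll in logliks.items():
        q = n_free_params(K, p)
        out[K] = (aic(ll, q), bic(ll, q, n_obs))
    return out
```

Lower AIC or BIC values indicate a better trade-off between fit and complexity. In practice, as in the analysis above, the point where the improvement levels off is often taken together with an inspection of the resulting clusters.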
The results for the five-cluster solution are shown in Fig. 2. As can be seen
from this figure, the geographical proximity of the tracks, their orientation
and their length tend to drive the clustering. The corresponding mean
trajectories are also overlaid for each of the five clusters.