In our analyses, we also take into account influential points, that is,
observations that greatly affect the slope of the regression line. Observations
can be flagged as potential influential points by means of leverage, DFFITS
and Cook's distance. The cutoff point for leverage in the above three
models is 0.0487, but note that a high-leverage point is not always an
influential point. DFFITS is a diagnostic meant to show how influential a point
is in a statistical regression (Belsley et al., 1980). It is defined as the
change (“DFFIT”) in the predicted value for a point, obtained when that point
is left out of the regression, “Studentized” by dividing by the estimated
standard deviation of the fit at that point:
$$\mathrm{DFFITS}_i = \frac{\hat{y}_i - \hat{y}_{i(i)}}{s_{(i)} \sqrt{h_{ii}}}$$
where $\hat{y}_i$ and $\hat{y}_{i(i)}$ are the predictions for point i with and
without point i included in the regression, $s_{(i)}$ is the standard error
estimated without the point in question, and $h_{ii}$ is the leverage for the
point. Large values of DFFITS indicate influential observations. An observation
with a DFFITS value greater than 0.453 is flagged for scrutiny. Similarly, an
observation is flagged if its Cook's distance is greater than 1.
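The three diagnostics described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `influence_diagnostics` and `flag_points` are hypothetical, the predictor matrix is assumed to have full column rank, and comparing DFFITS in absolute value is the usual convention rather than something stated in the text. The cutoffs are passed in as arguments, so the values quoted above (0.0487, 0.453 and 1) can be supplied by the caller.

```python
import numpy as np

def influence_diagnostics(X, y):
    """Leverage, DFFITS and Cook's distance for an ordinary least-squares fit.

    X: (n, k) predictor matrix without an intercept column; y: (n,) response.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    p = A.shape[1]                             # number of fitted coefficients

    # Leverages h_ii are the diagonal of the hat matrix, obtained via QR.
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)

    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = resid @ resid
    s2 = sse / (n - p)                         # residual variance, all points

    # s_(i)^2: residual variance with point i left out (closed form, no refit).
    s2_loo = (sse - resid**2 / (1.0 - h)) / (n - p - 1)

    # DFFITS_i = (yhat_i - yhat_i(i)) / (s_(i) * sqrt(h_ii))
    t = resid / np.sqrt(s2_loo * (1.0 - h))    # externally studentized residual
    dffits = t * np.sqrt(h / (1.0 - h))

    cooks = resid**2 * h / (p * s2 * (1.0 - h)**2)
    return h, dffits, cooks

def flag_points(h, dffits, cooks, h_cut, dffits_cut, cooks_cut=1.0):
    """Indices (0-based) exceeding any cutoff; |DFFITS| is an assumed convention."""
    mask = (h > h_cut) | (np.abs(dffits) > dffits_cut) | (cooks > cooks_cut)
    return np.flatnonzero(mask)
```

Note that the returned indices are 0-based, whereas the observation numbers quoted in the text appear to be 1-based.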
In our analyses, the following observations were flagged as potential
influential points:
Model 1: observations 1, 20 and 29,
Model 2: observations 1 and 29, and
Model 3: observations 1 and 20.
Model 1 was then adjusted and fitted three times, each time omitting one
flagged observation, and the same was done with Models 2 and 3, excluding
observations 20 and 29 in turn. The models with the highest adjusted R²
values were kept separately from Models 1, 2 and 3. To detect whether any
multicollinearity problems are present in the above models, the variance
inflation factor (VIF) was also calculated. The results (not shown) demonstrated
that there were no multicollinearity problems in the proposed multiple
regression models. Table 3 shows the modified model fitting results.
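As an illustration of the multicollinearity check, the VIF of each predictor can be computed by regressing it on the remaining predictors and taking 1 / (1 - R²). The sketch below assumes the predictors are held in a NumPy array with one column per predictor; the function name `variance_inflation_factors` is hypothetical, and no VIF threshold is implied, since the text does not state the one used.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (n observations by k predictors, no intercept)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean())**2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs
```

A VIF close to 1 means a predictor is nearly uncorrelated with the others, while large values indicate that its coefficient estimate is inflated by collinearity.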
The first observation, five TCs in the 1969-70 season, was found to be the
most influential point for all three models. All three indices, along with the
temporal trend, explained 44% to 50% of the total variation in the annual number
of TCs in the Australian region. After removing the influential point, one can
see (Table 3) that all three modified regression models perform better in terms
of adjusted R² values. The temporal trend effect also became
significant (P-values < 0.05) after removing the first observation, making the
temporal trend an essential predictor for the annual number of TCs in the
Australian region.
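The comparison of adjusted R² with and without a flagged observation can be sketched as follows. This is a minimal example under stated assumptions: the helper names `adjusted_r2` and `adjusted_r2_without` are hypothetical, the model is an ordinary least-squares fit with an intercept, and observations are indexed from 0 (so the first observation is index 0).

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R^2 of an OLS fit of y on X (n by k, intercept added here)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = resid @ resid
    sst = np.sum((y - y.mean())**2)
    # Penalize the fit for the k predictors used.
    return 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))

def adjusted_r2_without(X, y, drop):
    """Adjusted R^2 after refitting with observation index `drop` removed."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(len(y), dtype=bool)
    keep[drop] = False
    return adjusted_r2(X[keep], y[keep])
```

Calling `adjusted_r2_without(X, y, 0)` and comparing the result with `adjusted_r2(X, y)` mirrors the refit described above, in which the first observation is omitted.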
One possible reason for the downward trend in TCs in the Australian region
is the impact of ENSO on the geographical distribution of TCs. In the Australian