Environmental Engineering Reference
over an extended period prior to estimation. Once the necessary data described
above have been collected, the three functions would need to be estimated via
regression analysis. It is important to note that time periods in which no maintenance
is necessary should be recorded as having a zero maintenance cost, whereas periods
without failures should be omitted entirely from the dataset. This distinction reflects
the certain nature of maintenance and the uncertain nature of failure. It should also be
noted that certain types of assets, such as pipes, would have no maintenance. In this
case, the maintenance parameter drops out of Eq. 7.8 entirely.
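The recording rule above can be illustrated with a minimal sketch. The record layout and the field names (period, maintenance_cost, failure_cost) are hypothetical, and the sketch assumes that "the dataset" from which failure-free periods are omitted is the sample used to estimate the failure function.

```python
# Hypothetical per-period records; None means nothing was observed in that period.
records = [
    {"period": 1, "maintenance_cost": 120.0, "failure_cost": None},
    {"period": 2, "maintenance_cost": None, "failure_cost": 900.0},
    {"period": 3, "maintenance_cost": 80.0, "failure_cost": None},
]


def maintenance_sample(rows):
    """Keep every period; a period with no maintenance is recorded as zero cost."""
    return [
        (r["period"], 0.0 if r["maintenance_cost"] is None else r["maintenance_cost"])
        for r in rows
    ]


def failure_sample(rows):
    """Keep only the periods in which a failure was actually observed."""
    return [
        (r["period"], r["failure_cost"])
        for r in rows
        if r["failure_cost"] is not None
    ]


maintenance_rows = maintenance_sample(records)
failure_rows = failure_sample(records)
```

The maintenance sample keeps all three periods, with a zero in the period that had no maintenance, while the failure sample keeps only the single period in which a failure occurred.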
These functions, M(x), S(x), and F(x), can be estimated via standard ordinary
least squares (OLS) regression techniques. Upon completion of the regression, a critical
investigation into the violations of the Gaussian assumptions must be conducted,
and any violations must be addressed. Section 6 provides case studies of three
municipalities in British Columbia. In the case studies, three violations of the
Gaussian assumptions were detected: multicollinearity, heteroscedasticity and non-
normality. These violations may arise with similar datasets in other geographical
areas and are therefore given special attention here.
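As a rough illustration of the estimation step, the sketch below fits a single-regressor OLS line and returns the residuals on which the diagnostic checks would be run. The data values and the names x_obs and cost_obs are invented for illustration, and the single-regressor form is an assumption; the actual specifications of M(x), S(x), and F(x) may include further regressors such as dummy variables.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residuals are the input to checks for heteroscedasticity and non-normality.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical observations, not taken from the case studies.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
cost_obs = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, residuals = ols_fit(x_obs, cost_obs)
```

With an intercept in the model the residuals sum to zero by construction, so it is their spread and shape, not their mean, that the subsequent checks examine.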
Multicollinearity is a relatively minor problem for two reasons. If multicollinearity
exists, OLS is still the best linear unbiased estimator (BLUE); the coefficients are not
biased by this violation. The other reason that multicollinearity is a minor problem in
this application is that the multicollinearity often occurs between slope and intercept
dummies that communicate the same information. There are three ways to address this
violation. The user may want to return to the theory and determine if there is a sound
theoretical reason for including only one of the variables. 1
Heteroscedasticity is a violation that must be, and is, addressed more directly. 2
While this exploration centers around linear regression techniques, it is possible,
and even likely, that the data follow a nonlinear pattern. We can check this by
conducting nonparametric regressions. This technique determines a coefficient for
each observation. Since the coefficient represents the slope of the regression line,
different values for different observations indicate nonlinearity in the model. This is
best observed graphically by plotting the cumulative value of the coefficient at each
observation. The cumulative value is plotted since the value at any given observation
would be the sum of slopes at that observation and all prior ones. In the event
1 An alternative to this approach is to conduct regressions that exclude one variable and then the
other, to determine whether each variable is significant in the absence of the one with which it is
highly correlated. If they are not significant in the absence of multicollinearity, then there may be
statistical grounds for excluding these variables. If there are no theoretical or statistical grounds for
removing any of the variables, then these variables can still be included in the model because, as
identified above, OLS is not biased by multicollinearity.
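Before running the exclusion regressions, the highly correlated pair has to be identified. One simple way to do this, offered here only as an illustrative sketch, is a pairwise Pearson correlation between candidate regressors. The dummy values below are hypothetical: an intercept dummy and the corresponding slope dummy (the intercept dummy multiplied by x).

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical regressors: x, an intercept dummy, and the slope dummy built from it.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
intercept_dummy = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
slope_dummy = [d * xi for d, xi in zip(intercept_dummy, x)]

r = pearson(intercept_dummy, slope_dummy)
```

A correlation close to 1 in absolute value flags the pair as carrying largely the same information, which is the situation in which the exclusion regressions become relevant.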
2 In the case studies that follow, this violation is addressed in two ways. One can compensate for
heteroscedasticity by utilizing White's covariance matrix in the regression. This technique adjusts
the standard error on the coefficient to help determine which variables are in fact significant.
Another approach to addressing this violation is to conduct a Robust Least Absolute Error
regression. This method is preferred if there are other violations, such as non-normality. However,
no goodness of fit statistics are provided with this technique. Therefore, the Robust LAE regression
should only be performed if the third violation identified above, non-normality, is present.
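The standard-error adjustment can be sketched for the single-regressor case. The example below compares the classical OLS standard error of the slope with the basic form of White's heteroscedasticity-consistent estimator (often labelled HC0). The data are invented, and the choice of HC0 rather than a small-sample variant is an assumption, since the text does not say which form was used. The Robust LAE regression is not covered by this sketch.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, White HC0 SE) for y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Classical SE assumes one common error variance for all observations.
    classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # White (HC0) SE weights each squared residual by its own leverage term.
    white = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, classical, white


# Hypothetical data whose scatter widens as x grows.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_obs = [2.1, 3.9, 6.2, 7.8, 13.0, 9.0]
slope, se_classical, se_white = slope_standard_errors(x_obs, y_obs)
```

The coefficient estimate itself is unchanged by the adjustment; only the standard error, and therefore the significance test on the coefficient, differs between the two versions.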