Environmental Engineering Reference
13.3.4 Population estimation models
13.3.5 Accuracy assessment
Building detection accuracy was assessed at both the pixel and
the object level. At the pixel level, the overall accuracy (percentage
of correctly classified pixels) and the kappa statistic were derived
from the standardized confusion matrix (Congalton and Green,
1999). At the object level, we calculated the detection rate, i.e.
the percentage of reference buildings that were correctly detected,
and the commission error, i.e. the percentage of detected buildings
that were false detections.
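As an illustration of the measures just defined, the following is a minimal sketch, not the authors' implementation. It assumes a plain (not standardized) confusion matrix with reference classes in rows and classified classes in columns; the function names and the example counts are hypothetical.

```python
import numpy as np

def overall_accuracy(cm):
    """Fraction of correctly classified samples (diagonal over total)."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()

def kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    # Chance agreement from the products of matching row and column totals.
    p_chance = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
    return (p_observed - p_chance) / (1.0 - p_chance)

def detection_rate(n_correct, n_reference):
    """Percentage of reference buildings that were correctly detected."""
    return 100.0 * n_correct / n_reference

def commission_error(n_false, n_detected):
    """Percentage of detected buildings that are false detections."""
    return 100.0 * n_false / n_detected

# Hypothetical pixel counts: rows = reference, columns = classified
# (building, non-building).
cm = [[50, 10],
      [5, 35]]
print(round(overall_accuracy(cm), 3))  # 0.85
print(round(kappa(cm), 3))             # 0.694
```

The object-level measures need only three counts per method: correctly detected buildings, false detections, and reference buildings.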
To assess land use classification accuracy, both the reference
and the classified land use labels were attached to each
building footprint. Then a confusion matrix was built by cross-
comparison of the reference and extracted land use labels. Finally,
we calculated the overall and per-class accuracy ratios and the
kappa coefficient of agreement from the confusion matrix.
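The cross-comparison of labels can be sketched as follows. This is an assumption-laden illustration: the class list (including the "NR" non-residential class), the label values, and the choice to report both producer's and user's accuracy as the "per-class accuracy ratios" are hypothetical and not taken from the chapter.

```python
import numpy as np

def confusion_matrix(reference, classified, classes):
    """Cross-tabulate per-building labels: rows = reference, columns = classified."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for ref, cls in zip(reference, classified):
        cm[index[ref], index[cls]] += 1
    return cm

def per_class_accuracy(cm):
    """Return (producer's, user's) accuracy per class.

    Producer's accuracy divides the diagonal by the reference (row) totals,
    user's accuracy by the classified (column) totals. A class with a zero
    total yields nan.
    """
    cm = np.asarray(cm, dtype=float)
    diag = np.diag(cm)
    with np.errstate(divide="ignore", invalid="ignore"):
        producers = diag / cm.sum(axis=1)
        users = diag / cm.sum(axis=0)
    return producers, users

# Hypothetical labels attached to six building footprints.
classes = ["SF", "MF", "NR"]
reference = ["SF", "SF", "SF", "MF", "MF", "NR"]
classified = ["SF", "SF", "MF", "MF", "MF", "NR"]

cm = confusion_matrix(reference, classified, classes)
producers, users = per_class_accuracy(cm)
print(cm)
print(np.trace(cm) / cm.sum())  # overall accuracy
```

In this toy example one SF building is labeled MF, which lowers the producer's accuracy of SF and the user's accuracy of MF to 2/3 each.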
The goodness of fit of the population estimation models was
assessed through the coefficient of determination (R²), whereas
the model validation was based on statistics derived from the
absolute difference between census and estimated populations,
i.e. the absolute error. The statistics included the mean, standard
deviation, median (50th percentile), maximum (100th percentile),
lower quartile (25th percentile), upper quartile (75th percentile),
and interquartile range (the difference between the upper and
lower quartiles) of the absolute error. Among these error measures,
the median absolute error (MAE) and the interquartile range (IQR)
were considered more extensively, as they are the most common in
population estimation studies. Relative errors were not considered
because they are overly sensitive in areas of low population density
and undefined in non-populated areas.
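The validation statistics listed above can be computed in a few lines. The sketch below is illustrative only: the block populations are invented, the sample standard deviation (ddof=1) is an assumption, and NumPy's default linear interpolation is used for the quartiles, which the chapter does not specify.

```python
import numpy as np

def absolute_error_summary(census, estimated):
    """Summary statistics of |census - estimated| over census blocks."""
    err = np.abs(np.asarray(census, dtype=float) - np.asarray(estimated, dtype=float))
    q25, q50, q75 = np.percentile(err, [25, 50, 75])
    return {
        "mean": err.mean(),
        "std": err.std(ddof=1),  # sample standard deviation (assumption)
        "median": q50,           # the MAE in the chapter's terminology
        "max": err.max(),
        "q25": q25,
        "q75": q75,
        "iqr": q75 - q25,
    }

# Hypothetical census and estimated populations for five blocks.
# The third block is unpopulated: its absolute error is still defined,
# whereas a relative error would require division by zero.
census = [100, 50, 0, 200, 80]
estimated = [90, 65, 5, 180, 80]

summary = absolute_error_summary(census, estimated)
print(summary["median"], summary["iqr"])  # 10.0 10.0
```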
Seven linear models for population estimation at the census block
level were tested in this study, each with a different set of
explanatory variables that incorporated different information on
building structure and land use types.
Table 13.1 summarizes the building statistics and land use
information incorporated by each model. The first six models
were constructed by combining three building statistics at block
level (building count, footprint area and total volume) with
two levels of land use information (all residential buildings
together vs. single-family (SF) and multi-family (MF) buildings
separately). For example, Model 1 uses the per-block count of
residential buildings (N) regardless of whether they are SF or MF,
whereas Model 2 splits the count of residential buildings into
the count of SF buildings (N1) and the count of MF buildings
(N2). These two models are inspired by the previous housing
unit method (Watkins and Morrow, 1985; Smith and Cody,
2004), where the number of buildings replaces the number of
housing units. Likewise, Model 3 and Model 4 are inspired by
the broadly used area-based methods, but with a linear form.
We used the linear form of the area-population relationship
because a preliminary exploratory analysis confirmed that the
linear form fitted the data better than the allometric form.
Models 5 and 6 were proposed under the premise that building
volume can better describe the living space, and thus may allow
for more accurate population estimates. Model 7 is an optimal
linear model constructed by selecting a few explanatory
variables from an initial set of 16. The initial set
included building count, area, volume, perimeter, shape
and height, from both SF and MF buildings. The variable selection
strategy sought to include the smallest number of explanatory
variables while maintaining a high correlation with the dependent
variable.
The coefficients for each of the seven models were estimated
through the ordinary least squares procedure (Selvin,
1995). We also calculated normalized coefficients or path
coefficients, which correspond to the regression coefficients
multiplied by the standard deviation of the explanatory
variable divided by the standard deviation of the dependent
variable. The normalized forms are useful for inter-comparisons
as they represent the sensitivity of the dependent variable to
the variation of the independent variable (Selvin, 1995).
13.4 Results
13.4.1 Building detection results
The results from each building detection method are illustrated
in Fig. 13.2. These error maps were built by comparing the
detection mask from each method with the reference building
footprint layer in raster format. Errors of omission and commission
are colored in blue and red for easy identification. In
addition, the overall per-pixel accuracy, the kappa statistic
(Congalton, 1991), the detection rate, and the commission error were
TABLE 13.1 Seven population estimation models.

Model  Equation                          Land use type  Adopted variables
1      P1 = α1·N + ε                     Residential    Building counts
2      P2 = α1·N1 + α2·N2 + ε            SF and MF      Building counts
3      P3 = α1·A + ε                     Residential    Footprint area
4      P4 = α1·A1 + α2·A2 + ε            SF and MF      Footprint area
5      P5 = α1·V + ε                     Residential    Building total volume
6      P6 = α1·V1 + α2·V2 + ε            SF and MF      Building total volume
7      P7 = α1·A1 + α2·N1 + α3·V1 + ε    SF and MF      Building counts, footprint area, building volume

P stands for the population count of a Census block; N, A and V stand for the building count, footprint area and total
building volume within the block; the subscripts 1 and 2 denote single-family (SF) and multi-family (MF) buildings (e.g. N1 is the
single-family building count and N2 the multi-family building count); α stands for the coefficients to be estimated from the
models; ε stands for the residual of the model.
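To make the equations in Table 13.1 concrete, the sketch below fits Model 2 by least squares and derives the normalized (path) coefficients as described in Section 13.3.4. It is a hedged illustration rather than the authors' code: the data are synthetic, the model is fitted without an intercept because the table equations show none, and the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np

def fit_no_intercept(X, p):
    """Least-squares estimate of alpha in p = X @ alpha + eps (no intercept)."""
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    alpha, *_ = np.linalg.lstsq(X, p, rcond=None)
    return alpha

def normalized_coefficients(alpha, X, p):
    """Regression coefficients scaled by std(explanatory) / std(dependent)."""
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    return alpha * X.std(axis=0, ddof=1) / p.std(ddof=1)

# Synthetic blocks: columns are N1 (SF building count) and N2 (MF building count).
X = np.array([[10, 0],
              [20, 2],
              [5, 1],
              [0, 4],
              [15, 3]], dtype=float)
# Populations generated exactly as 3 persons per SF and 10 per MF building
# (invented values, chosen so the fit is easy to check).
p = 3.0 * X[:, 0] + 10.0 * X[:, 1]

alpha = fit_no_intercept(X, p)
beta = normalized_coefficients(alpha, X, p)
print(np.round(alpha, 6))  # recovers the generating coefficients
print(np.round(beta, 3))
```

The other models in the table follow by swapping the columns of X, for example a single column N for Model 1 or the columns A1, N1 and V1 for Model 7.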