Case Study on Multispectral Land Cover Classification - Open Source Geospatial Tools: Applications in Earth Observation


case where an independent reference test set is already available, the choice of the

sampling design has already been made.

In Sect. 14.2.6, we have seen how to generate a systematic grid of points in

Python. 8 As an alternative, we will use the method shown in Sect. 10.3 that is based

on the utility gdal2xyz.py.
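To make the idea of a systematic grid concrete, the following is a minimal sketch in plain Python. It assumes a GDAL-style six-element geotransform and assumes that each sample is placed at the centre of every skip-th pixel; the function name `systematic_grid` and the raster dimensions are hypothetical and chosen only for illustration.

```python
def systematic_grid(geotransform, n_cols, n_rows, skip):
    """Return (x, y) map coordinates for the centre of every skip-th pixel.

    geotransform follows the GDAL convention:
    (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    """
    x0, dx, rx, y0, ry, dy = geotransform
    points = []
    for row in range(0, n_rows, skip):
        for col in range(0, n_cols, skip):
            # Shift by half a pixel to address the pixel centre (assumption).
            x = x0 + (col + 0.5) * dx + (row + 0.5) * rx
            y = y0 + (col + 0.5) * ry + (row + 0.5) * dy
            points.append((x, y))
    return points


# Hypothetical north-up raster: 30 m pixels, 1000 columns by 600 rows.
gt = (500000.0, 30.0, 0.0, 6000000.0, 0.0, -30.0)
grid = systematic_grid(gt, n_cols=1000, n_rows=600, skip=200)
print(len(grid), grid[0])
```

With a skip factor of 200, this hypothetical raster yields 5 sampled columns and 3 sampled rows.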

We adapt the command from Sect. 10.3 in a number of ways. First, we exclude

pixels with a value of 0 from the accuracy assessment. These pixels are either outside

of the acquisition area of the sensor or have been identified as cloudy. There is no

option to ignore no-data values in the utility gdal2xyz.py, so we need to do it

in a different way. One option is to open the output file in a text editor or in a

spreadsheet program and do it manually. Here, we use an automatic method based on the

awk command in Bash.

gdal2xyz.py -skip 200 LC82070232013_fmap_masked.tif | awk -v OFS="," '{ if ($3>0) print $1,$2,"noforest","valid"}' > sample.csv

-skip 200

Downsampling factor: report every 200th row and column.

LC82070232013_fmap_masked.tif

Name of the input raster dataset.

| awk

Pipe the output of the previous command to the awk command.

-v OFS=","

Set the output field separator to a comma, so that the output is in comma-separated values (CSV) format.

'{ if ($3>0) print $1,$2,"noforest","valid"}'

Print the first two columns if the third column is strictly positive. Add extra third and

fourth columns with the fixed text "noforest" and "valid".

> sample.csv

Write the output to a file named sample.csv (output redirection).
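For readers who prefer to stay in Python, the filtering step can be sketched as follows. This is not the method used in the text, only an illustrative equivalent of the awk filter; it assumes whitespace-separated "x y value" lines as input, and the function name `filter_xyz` and the sample coordinates are hypothetical.

```python
import csv
import io


def filter_xyz(lines, out, label="noforest", comment="valid"):
    """Write x, y, label, comment rows for every line whose value is > 0."""
    writer = csv.writer(out, lineterminator="\n")
    kept = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip empty or malformed lines
        x, y, value = fields[0], fields[1], float(fields[2])
        if value > 0:  # a value of 0 marks no-data (outside swath or cloudy)
            writer.writerow([x, y, label, comment])
            kept += 1
    return kept


# Hypothetical input in the "x y value" layout; the first point is no-data.
xyz = [
    "500015.0 5999985.0 0",
    "506015.0 5999985.0 1",
    "512015.0 5999985.0 2",
]
buf = io.StringIO()
n_kept = filter_xyz(xyz, buf)
csv_text = buf.getvalue()
print(csv_text, end="")
```

To write a file instead of an in-memory buffer, pass a handle opened with `open("sample.csv", "w", newline="")` as `out`.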

The output is written in CSV format so that it can be read by OGR. The

extra field in column three will be used to describe the label of our sample as part of

the response design (Stehman and Czaplewski 1998). We expect most of the points

to fall in non-forested areas. We therefore set the initial description to "noforest".

This will save us typing when revisiting the points as we only have to re-label points

within a forest area. The extra field in column four will be used for comments in the

labeling process.

Based on the CSV file, we create a virtual OGR vector, sample.vrt, using the

same approach as described in Sect. 2.6:

8 The utility pkextract contains a similar feature (see Sect. 12.3).
