PE&RS March 2016 full version - page 202

images were compared to the same image derivatives generated
directly from the median composite by extracting 1,000 ran-
dom values from each image and running linear regressions.
Modeling Percent Canopy Cover
There were 677, 1,317, 568, 681, and 523 FIA plot locations,
which represent 20 percent of the total FIA plots, used to
collect the PTCC response data for zones 16, 23, 48, 54, and
59, respectively. A circle with a radius of 43.9 m was placed
over each FIA plot center. Each circle contained a 109-dot
grid, which was oriented 15 degrees east of true north, with
each dot separated by 8 m. Using 1 m spatial resolution
digital aerial photography, photo-interpreters evaluated each
dot as being either tree or not tree. For each plot, PTCC was
calculated from these dot counts.
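The text does not spell out how the dot counts become a percentage. A minimal sketch, assuming PTCC is simply the share of interpreted dots classed as tree (the function name and the example plot are hypothetical):

```python
def percent_tree_canopy_cover(dot_is_tree):
    """Return percent canopy cover for one plot from its interpreted dots.

    Assumption: PTCC = 100 * (dots classed as tree) / (dots interpreted).
    """
    dots = list(dot_is_tree)
    if not dots:
        raise ValueError("a plot needs at least one interpreted dot")
    return 100.0 * sum(1 for is_tree in dots if is_tree) / len(dots)


# Hypothetical plot: 109 dots, 48 of them interpreted as tree.
example_plot = [True] * 48 + [False] * 61
ptcc = percent_tree_canopy_cover(example_plot)
```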
For each of the L5 datasets (model II regression mosaic,
median composite, and maximum NDVI composite), NDMI, NDVI,
and Tassel Cap images were created to use as explanatory
variables in the PTCC modeling. A 30 m digital elevation
model (DEM), Bailey’s ecoregions (Bailey, 1983), NLCD 2001
land cover, NLCD 2001 PTCC, and pixel coordinates were also
used as explanatory variables. Aspect, cosine and sine of
transformed aspect, and slope in degrees were generated from
the DEM. Standard deviations were calculated using a 3 by 3
focal window for all the explanatory images except for
Bailey’s ecoregions, NLCD 2001 land cover, and the pixel
coordinates. The spatial resolution of all these datasets was
30 m.
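The index formulas are not given in this section. The sketch below uses the standard normalized-difference definitions, with Landsat 5 TM band 3 as red, band 4 as near infrared, and band 5 as shortwave infrared. The reflectance values and function names are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None where the denominator is zero."""
    total = a + b
    return None if total == 0 else (a - b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndmi(nir, swir1):
    """Normalized Difference Moisture Index."""
    return normalized_difference(nir, swir1)


# Hypothetical surface reflectance values for one L5 pixel.
pixel = {"band3_red": 0.05, "band4_nir": 0.40, "band5_swir1": 0.20}
pixel_ndvi = ndvi(pixel["band4_nir"], pixel["band3_red"])
pixel_ndmi = ndmi(pixel["band4_nir"], pixel["band5_swir1"])
```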
Because the spatial resolution of the response data was
approximately 90 m, focal means using 3 by 3 windows were
created for all of the 30 m explanatory variables except for
Bailey’s ecoregions, NLCD 2001 land cover, and the pixel
coordinates. For Bailey’s ecoregions and NLCD 2001 land
cover, focal majority images using 3 by 3 windows were
created.
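A 3 by 3 window of 30 m pixels covers 90 m by 90 m, which is why these focal statistics bring the explanatory data closer to the footprint of the response data. The following is an illustrative sketch only. The edge handling (smaller windows at the border), the use of the population standard deviation, and the tie rule for the majority are all assumptions, and the small rasters are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def window_values(grid, row, col):
    """Collect the values of the 3 by 3 window centered on (row, col).

    Cells that fall outside the grid are skipped, so edge windows are smaller.
    """
    values = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                values.append(grid[r][c])
    return values


def focal(grid, statistic):
    """Apply `statistic` to the 3 by 3 window around every cell."""
    return [[statistic(window_values(grid, r, c)) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def majority(values):
    """Most frequent value; ties go to the value encountered first."""
    return Counter(values).most_common(1)[0][0]


# Hypothetical rasters: a continuous variable and a land-cover class image.
ndvi = [[0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6],
        [0.7, 0.8, 0.9]]
land_cover = [[41, 41, 42],
              [41, 42, 42],
              [41, 41, 81]]

ndvi_focal_mean = focal(ndvi, mean)            # continuous variables
ndvi_focal_sd = focal(ndvi, pstdev)            # 3 by 3 standard deviation
land_cover_majority = focal(land_cover, majority)  # categorical variables
```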
The model algorithm used was random forest as imple-
mented in R 3.02 (Liaw and Wiener, 2002; R Core Team,
2013). The focal means, majorities, and pixel coordinate data-
sets along with all of the response data were used to train the
random forest model. The random forest model was applied
to the original 30 m datasets.
The number of trees used in the random forest algorithm
was 500, which means for every pixel, 500 PTCC predictions
were generated. The final PTCC estimate for each pixel was the
mean of these 500 predictions. Standard errors for each pixel
were derived from these 500 predictions (Zar, 1996). Using
the standard errors, 95 percent confidence limits were created
for each pixel. These 95 percent confidence limits were used
to compare the model predictions derived for each of the
three composite types. To provide information on the
land-cover types where significant differences were occurring,
the pixels that were significantly different were intersected
with the NLCD 2011 land-cover image.
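The per-pixel calculation can be sketched as follows. This assumes the textbook standard error of the mean (sample standard deviation divided by the square root of the number of trees), a normal critical value of 1.96, and that two estimates count as significantly different when their confidence limits do not overlap. None of these details is stated explicitly in the text, and the prediction values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # assumed normal critical value for 95 percent limits


def confidence_limits(tree_predictions):
    """Return (estimate, lower, upper) from the per-tree predictions of a pixel."""
    n = len(tree_predictions)
    estimate = mean(tree_predictions)
    standard_error = stdev(tree_predictions) / sqrt(n)
    half_width = Z_95 * standard_error
    return estimate, estimate - half_width, estimate + half_width


def significantly_different(limits_a, limits_b):
    """Assumed rule: estimates differ when their intervals do not overlap."""
    _, low_a, high_a = limits_a
    _, low_b, high_b = limits_b
    return high_a < low_b or high_b < low_a


# Hypothetical per-tree PTCC predictions for one pixel under two composites
# (only 5 trees each to keep the example short; the study used 500).
median_composite = [40.0, 42.0, 41.0, 43.0, 39.0]
max_ndvi_composite = [55.0, 57.0, 56.0, 58.0, 54.0]

limits_median = confidence_limits(median_composite)
limits_max_ndvi = confidence_limits(max_ndvi_composite)
different = significantly_different(limits_median, limits_max_ndvi)
```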
As part of the random forest algorithm, approximately 37
percent of the reference data are withheld from developing the
models (Breiman, 1996; Breiman, 2001). These out-of-bag (OOB)
data can be used as assessments of model performance
(Breiman, 2001; Adelabu et al., 2015). The two OOB metrics
used in this study to compare the models were percent variance
explained and mean of squared residuals (see Equations 2 and
3). The higher the percent variance explained and the lower
the mean of squared residuals, the better the model
performance.
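The figure of roughly 37 percent follows from bootstrap sampling. If each tree draws n samples with replacement from n reference samples (assumed here), a given sample is left out with probability (1 - 1/n)^n, which approaches 1/e for large n. A short check, using the zone 16 plot count from the text:

```python
from math import exp


def expected_oob_fraction(n_samples):
    """Probability that one sample is never drawn in n draws with replacement."""
    return (1.0 - 1.0 / n_samples) ** n_samples


# 677 is the number of response plots reported for zone 16.
oob_zone_16 = expected_oob_fraction(677)
large_sample_limit = exp(-1.0)  # about 0.368
```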
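\[
\text{Percent Variance Explained} = 1 - \frac{\sum (\text{Observed} - \text{Predicted})^{2}}{\sum (\text{Observed} - \overline{\text{Observed}})^{2}} \tag{2}
\]

\[
\text{Mean of Squared Residuals} = \frac{\sum (\text{Observed} - \text{Predicted})^{2}}{\text{Number of Samples}} \tag{3}
\]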
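As an illustration, the two metrics can be computed from observed and OOB-predicted values as sketched below. The sums are taken over all samples. The first function returns a fraction, to be multiplied by 100 for a percentage. The sample values and function names are hypothetical.

```python
from statistics import mean


def mean_of_squared_residuals(observed, predicted):
    """Sum of squared residuals divided by the number of samples."""
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return residual_ss / len(observed)


def variance_explained(observed, predicted):
    """One minus residual sum of squares over total sum of squares."""
    observed_mean = mean(observed)
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_ss = sum((o - observed_mean) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


# Hypothetical observed PTCC values and their OOB predictions.
observed = [10.0, 35.0, 60.0, 85.0]
oob_predicted = [15.0, 30.0, 65.0, 80.0]

msr = mean_of_squared_residuals(observed, oob_predicted)
explained = variance_explained(observed, oob_predicted)
```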
Greenfield et al. (2009) conducted an accuracy assessment
of the NLCD 2001 PTCC using Wilcoxon signed rank tests.
Wilcoxon signed rank tests (α = 0.05) were used in this study
to examine if differences in the OOB metrics were significant.
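For readers unfamiliar with the test, a rough sketch of a two-sided Wilcoxon signed rank test follows. It uses the large-sample normal approximation without tie or continuity corrections, which is a simplification. This section does not state which values were paired in the study, so the paired metric values below are purely hypothetical.

```python
from math import erf, sqrt


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed rank test (normal approximation).

    Returns (W, p_value), where W is the smaller of the positive and
    negative rank sums. Zero differences are dropped and tied absolute
    differences receive their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while (end + 1 < n and
               abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]])):
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[ordered[k]] = average_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mu = n * (n + 1) / 4.0
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mu) / sigma               # z <= 0 because w <= mu
    p_value = 1.0 + erf(z / sqrt(2.0))  # equals 2 * Phi(z)
    return w, min(1.0, p_value)


# Hypothetical paired OOB mean of squared residuals for two composite types.
composite_a = [210, 198, 225, 240, 205, 232, 219, 201, 228, 236]
composite_b = [200, 190, 214, 227, 190, 215, 200, 180, 205, 210]
w_stat, p_value = wilcoxon_signed_rank(composite_a, composite_b)
significant = p_value < 0.05
```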
Results
There were 765 L5 scenes reprojected and converted to surface
reflectance according to the methods described above. Thirty
percent of the L5 scenes were from 2011; 37 percent were from
2010; 29 percent were from 2009; 3 percent and 1 percent
were from 2008 and 2007, respectively. Using these images,
median composite images and maximum NDVI composite images
were created for 51 paths/rows. Two model II regression
mosaics, one for the western zones (16 and 23) and one for the
eastern zones (48, 54, and 59), had been created previously as
part of the prototype study (Coulston et al., 2012).
Median Composite
Because of clouds and shadows, not all pixels were available
to use for the median calculation. For the western zones (16
and 23), a mean of 14 pixel values with a standard deviation
of 1 were available for the median calculation. For two of the
eastern zones (48 and 54), a mean of 13 pixel values with a
standard deviation of 1 were available for the median calcula-
tion. Zone 59 had a mean of 11 pixel values with a standard
deviation of 2 that were available for the median calculation.
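The median composite images do not preserve the integrity
of the bands: because the median is taken separately for each
band, the band values of a single pixel can come from
different dates. Tables 1 and 2 show the dates chosen and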
the percent of pixels used for each date for paths/rows 36/33
(Table 1), which lies in the center of zone 23, and 20/36 (Table
2), which lies in the center of zone 48. The abundance of data
prevents showing the information for all paths/rows, but these
are typical examples of the variety of dates and the percent-
age of pixels that were used for each band. Unless otherwise
stated, these two paths/rows will be used for all examples.
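The loss of band integrity can be illustrated with a per-band median that also records which acquisition supplied the value. This is a sketch under assumptions: the rule for an even number of observations (take the lower middle observation) is not from the text, and the reflectance values and the 2010 date are hypothetical.

```python
def median_observation(observations):
    """Return the (date, value) pair holding the median value.

    `observations` holds (date, value) pairs for one band of one pixel,
    already screened for clouds and shadows. With an even count the lower
    of the two middle observations is returned so that the result is always
    a real acquisition; that tie rule is an assumption.
    """
    ordered = sorted(observations, key=lambda obs: obs[1])
    return ordered[(len(ordered) - 1) // 2]


# Hypothetical clear observations for one pixel: band -> [(date, reflectance)].
pixel = {
    "band3": [("2009-08-19", 0.06), ("2010-07-21", 0.04), ("2011-08-09", 0.05)],
    "band4": [("2009-08-19", 0.35), ("2010-07-21", 0.41), ("2011-08-09", 0.44)],
}

composite = {band: median_observation(obs) for band, obs in pixel.items()}
dates_used = {band: date for band, (date, _) in composite.items()}
# The two bands of this one pixel end up with values from different dates.
bands_agree = dates_used["band3"] == dates_used["band4"]
```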
For 36/33, between 8 and 15 pixel values were available
for the median calculation with a mean of 14 and a standard
deviation of 1. For 20/36, between 1 and 15 pixel values were
available for the median calculation with a mean of 12 and a
standard deviation of 1.
While pixels from 09 August 2011 were used most often in the
median composite for bands 1 to 3 of 36/33, this date was
the sixth, second, and third most commonly used date for
bands 4, 5, and 7, respectively. For band 5, the most
commonly used date was 19 August 2009, which differs by a
couple of years from the primary dates for the other bands.
For 20/36, the most commonly used dates were more mixed and
variable. For bands 1 and 3, the largest share of pixels came
from 04 October 2008. For bands 4 and 5, the largest share
came from 16 June 2009. For band 2, the largest share came
from 31 May 2009. For band 7, the largest share came from 06
June 2011. The primary dates for all bands came from a variety of
years: 2008, 2009, and 2011. For both of these paths/rows, the
primary dates chosen were used for a maximum of 22 percent
of the pixels and a minimum of 9 percent of the pixels. For
all 51 paths/rows, the average maximum percentage that a
particular date was used for a band was 15 percent with a
standard deviation of 7 percent.
Also examined was how often the dates used for a band agreed
with the dates used for the other bands at a given pixel. The
results for 36/33 and 20/36 are shown in Table 3. For 36/33,
bands 2 and 3 had the highest percentage (15 percent) of pix-
els that agreed with the dates used for the median composite.
The bands with the least agreement were bands 3 and 4 with
a percentage of 6 percent. For 20/36, bands 5 and 6 had the
highest percentage of agreement (19 percent) while bands 1
and 4, 2 and 4, and 3 and 4 had the lowest percentage of agree-
ment (9 percent). For all paths/rows, the average percentage
of pixels that were in agreement regarding dates used for the
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING