PE&RS August 2014 - page 761

For the single HNB band analysis, Pearson correlations (R) were
computed between biomass and each 10 nm HNB, showing both the
strength and direction of the relationship. The correlations, as
well as the other statistical procedures discussed in the remainder
of the paper, were performed in R. Plots were made for each
transformed dataset (untransformed, first derivative, and second
derivative) and for every crop type. Initially, the analysis
was performed for one mega dataset that included all crop
types, but the results were poorer than those obtained per crop
type, so no further analysis was performed on the mega dataset.
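
The single-band screening described above can be illustrated with a minimal sketch. The authors worked in R; the Python below is only a stand-in, and the function names (`pearson_r`, `correlate_bands`) and the toy reflectance and biomass values are hypothetical, not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

def correlate_bands(spectra, biomass):
    """Return one Pearson R per band.

    spectra: list of samples, each a list of band reflectances
    biomass: list of biomass values, one per sample
    """
    n_bands = len(spectra[0])
    return [pearson_r([s[b] for s in spectra], biomass) for b in range(n_bands)]

# Hypothetical toy data: 4 samples x 3 bands (not from the paper)
spectra = [
    [0.10, 0.40, 0.20],
    [0.08, 0.50, 0.30],
    [0.06, 0.60, 0.30],
    [0.04, 0.70, 0.20],
]
biomass = [1.0, 2.0, 3.0, 4.0]
r_by_band = correlate_bands(spectra, biomass)
```

In this toy case the first band falls as biomass rises (negative R), the second rises with it (positive R), and the third shows no linear relationship. The sign carries the direction of the relationship and the magnitude its strength.
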
Further processing was performed on the second derivative
transformed data, as the correlations between the second
derivative transformed spectra and biomass were highly variable.
In order to smooth out inconsistencies between second derivative
transformed spectra, the absolute values of HNB reflectance were
integrated over important inflection points per Elvidge and Chen
(1995). Various integrating windows were evaluated (25 to 300 nm
at 25 nm intervals). The resulting correlations were consistently
lower than those of the non-integrated second derivative
transformed spectra, so the integrated values were not included
in model building.
For the two-band HNB analysis, lambda-lambda (R²) contour plots
for each transformation and crop type were created by correlating
every possible two-band combination of the 196 bands with biomass.
Analogous to NDVI, the vegetation indices were calculated as
follows:
$$\mathrm{HVI} = \frac{R_2 - R_1}{R_2 + R_1} \tag{1}$$
where HVI is the two-band HVI, $R_1$ is the untransformed or
transformed reflectance of band one, and $R_2$ is the untransformed
or transformed reflectance of band two.
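
As a rough illustration of the search behind a lambda-lambda plot, the sketch below computes the normalized-difference index of Equation 1 for every band pair and keeps the pair with the highest R² against biomass. It is not the authors' R code; the helper names and the toy values are hypothetical, and pairs whose denominator is zero (possible with derivative-transformed data) are simply skipped here as an assumption.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def hvi(r1, r2):
    """Normalized difference of two bands, as in Equation 1."""
    return (r2 - r1) / (r2 + r1)

def best_two_band_hvi(spectra, biomass):
    """Scan every band pair; return (r_squared, band1, band2) of the best pair."""
    n_bands = len(spectra[0])
    best = (-1.0, None, None)
    for b1, b2 in combinations(range(n_bands), 2):
        # Assumption: skip pairs where any sample has a zero denominator.
        if any(s[b1] + s[b2] == 0 for s in spectra):
            continue
        index = [hvi(s[b1], s[b2]) for s in spectra]
        r = pearson_r(index, biomass)
        if not math.isnan(r) and r * r > best[0]:
            best = (r * r, b1, b2)
    return best

# Hypothetical toy data: 4 samples x 3 bands (not from the paper)
spectra = [
    [0.4, 0.6, 0.5],
    [0.3, 0.7, 0.2],
    [0.2, 0.8, 0.6],
    [0.1, 0.9, 0.3],
]
biomass = [1.0, 2.0, 3.0, 4.0]
best = best_two_band_hvi(spectra, biomass)
```

With 196 bands this loop visits 19,110 unordered pairs. Storing the R² of each pair in a 196 × 196 grid, rather than keeping only the maximum, would give the values behind a contour plot.
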
Multiple Band HVI Using Sequential Search Methods
Sequential search methods (SSM), like stepwise and forward
addition (FA) regression, iteratively select the best predictor
and then incrementally include predictors successively having
lower, but significant, partial correlation with the residuals
(Hair, Jr. et al., 1998). In this study, we used FA, because the
number of predictors exceeded the number of samples for each crop
type. A subset of the data used to calibrate two-band HVIs was
used for SSM, because SSM requires that missing wavelengths are
consistent across all samples. A common rule of thumb in
statistics to prevent over-fitting is to build a model with no
more than one predictor per 20 samples; this criterion was used
during the model building process. Several methods for testing
the significance of the partial correlations exist. We used the
Akaike Information Criterion (AIC), which is a function of both
the maximized log-likelihood and the number of predictors at each
incremental stage of model building. Lower AIC values, therefore,
indicate not only better model fit but greater model parsimony as
well. Each step was scrutinized and a final model was selected
that had low AIC, predictors significant at the 99.9 percent
confidence level, and in which each predictor explained at least
an additional two percent of the biomass variance (ΔR² ≥ 0.02).
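
A minimal sketch of forward addition driven by AIC follows. It is not the authors' R procedure. It assumes an ordinary least squares fit with Gaussian errors, for which AIC can be written, up to an additive constant, as n·ln(RSS/n) + 2k with k the number of fitted coefficients. The function names, the `max_predictors` cap (standing in for the samples-per-predictor rule of thumb), and the toy data are all hypothetical. The significance and ΔR² checks described above are not implemented.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols_rss(columns, y):
    """Residual sum of squares of a least squares fit with an intercept."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)  # assumes predictors are not perfectly collinear
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(design, y))

def aic(rss, n, n_coefficients):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * n_coefficients

def forward_addition(predictors, y, max_predictors):
    """Add one predictor at a time while AIC keeps decreasing."""
    n = len(y)
    selected = []
    current = aic(ols_rss([], y), n, 1)  # intercept-only model
    while len(selected) < max_predictors:
        trials = []
        for j in range(len(predictors)):
            if j not in selected:
                cols = [predictors[k] for k in selected + [j]]
                trials.append((aic(ols_rss(cols, y), n, len(cols) + 1), j))
        if not trials:
            break
        best_aic, best_j = min(trials)
        if best_aic >= current:
            break  # no candidate lowers AIC any further
        selected.append(best_j)
        current = best_aic
    return selected, current

# Hypothetical toy data: 8 samples, 3 candidate predictors (not from the paper)
x0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x1 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
noise = [0.1, -0.1, 0.05, -0.05, -0.1, 0.1, 0.05, -0.05]
y = [2.0 * a + 0.5 * b + e for a, b, e in zip(x0, x2, noise)]
selected, final_aic = forward_addition([x0, x1, x2], y, max_predictors=2)
```

Because the search starts from an intercept-only model and only ever adds terms, it stays usable when the candidate predictors outnumber the samples, which is the situation the text gives as the reason for choosing FA.
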
Multiple Band HVI Using Principal Components Regression
Principal component regression (PCR) uses component scores
derived from a Principal Component Analysis (PCA) as predictor
variables (Rodarmel and Shan, 2002). The purpose of PCA is to
transform large collinear datasets to reduce interdependencies
and yield factors (linear combinations) of input variables that
explain unique proportions of the total variance. This is carried
out in one step, in contrast to SSMs, which analyze predictors
individually and may remove important, but collinear, predictors.
The first component explains the most variance, while subsequent
components explain successively less variance. The smallest
components typically explain the error variance. The
transformation is performed by the eigenvalue decomposition of
the correlation or covariance matrix. Transformations on the
correlation matrix are typically performed to standardize the
data. The linear transformation is defined as follows:
$$y_k(i) = w_k \cdot x_i$$
where $y_k(i)$ is the score for sample $i$, $w_k$ is the loading
for variable $k$, and $x_i$ is the sample value at variable $k$.
The square of the loading, as the name implies, is the
contribution of the variable to the factor total variance. The
scores are the values of the component for each sample and are
used to determine the regression coefficients in single or
multiple PCR analysis.
The PCA was performed on a subset of the wavelengths and biomass
was then regressed against the scores. Principal Components
Analysis, like SSM, requires that each sample has the same
missing wavelengths, so some of the samples with missing
wavelengths inconsistent with the full calibration dataset used
to develop the two-band HVIs could not be used. The data were
mean-centered and the transformation was performed on the
covariance matrix, because the reflectance ranged from 0 to 1 at
all wavelengths. Orthogonal rotations maximize variable loading
on a single factor, which makes predictor loadings more
interpretable, so orthogonal (VARIMAX) transformations were
analyzed alongside the unrotated data. Oblique transformations
were not used, because the procedures are less developed and
subject to debate (Hair, Jr. et al., 1998). Scree plots, which
show the contribution of each component to the total variance,
along with correlations and significance tests (p < 0.001) from
the PCR, were used to determine the number of components to
include in the final PCR models.
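
The unrotated, single-component case of this procedure can be sketched as follows: mean-center the spectra, form the covariance matrix, extract the leading loadings, compute the scores, and regress log-transformed biomass on them. This is an illustrative stand-in, not the authors' R workflow. Power iteration replaces a full eigenvalue decomposition for brevity, and its all-positive starting vector assumes the leading component is not orthogonal to it. VARIMAX rotation is not shown, and all names and values are hypothetical.

```python
import math

def mean_center(rows):
    """Subtract the per-band mean; return centered rows and the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means

def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[j] * r[k] for r in centered) / (n - 1) for k in range(p)]
            for j in range(p)]

def first_component(cov, iterations=500):
    """Leading eigenvector (loadings) of cov by power iteration."""
    p = len(cov)
    w = [1.0 / math.sqrt(p)] * p  # assumed not orthogonal to the first component
    for _ in range(iterations):
        v = [sum(cov[j][k] * w[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w

def component_scores(centered, loadings):
    """Score per sample: sum over bands of loading times centered reflectance."""
    return [sum(w * x for w, x in zip(loadings, row)) for row in centered]

def fit_line(x, y):
    """Simple least squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical calibration data: 5 samples x 3 bands, reflectance in [0, 1]
spectra = [
    [0.10, 0.20, 0.30],
    [0.20, 0.30, 0.35],
    [0.30, 0.45, 0.50],
    [0.40, 0.50, 0.55],
    [0.50, 0.65, 0.70],
]
log_biomass = [0.0, 0.3, 0.6, 0.8, 1.1]  # already log-transformed (made up)

centered, band_means = mean_center(spectra)
loadings = first_component(covariance(centered))
pc_scores = component_scores(centered, loadings)
intercept, slope = fit_line(pc_scores, log_biomass)

# Applying the calibration to a hypothetical validation spectrum:
validation = [0.25, 0.40, 0.45]
val_score = sum(w * (v - m) for w, v, m in zip(loadings, validation, band_means))
predicted_log_biomass = intercept + slope * val_score
```

Working on the covariance matrix keeps the bands in their native reflectance units, which matches the rationale given in the text. Standardizing each band first would correspond to the correlation-matrix variant.
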
Model Validation
Validation data for each model category were handled differently;
however, all approaches yielded scatterplots of actual versus
predicted biomass and summary statistics. In order to validate
the two-band HVI approach, the HVI that ranked first in the
calibration subset was used for model validation. Many of the
two-band HVIs, however, had similar correlations and could have
been used alternatively. The SSM approach was the most
straightforward: the equations derived from the calibration
dataset were applied to the validation HNBs. For PCR, the
loadings were multiplied by the validation HNBs and summed to
yield a score across components, which was then used to create a
linear model with log-transformed biomass.
Results
Single HNB Relationship
Pearson correlations (R) reveal the strength and direction of
the HNB-biomass relationships when all the crops are considered
and for crop type subsets. Generally, the patterns of the single
HNB analysis for the crop type subsets shown in Figure 2 are
consistent with the “all crops” results; however, the
correlations are lower for the latter (maximum |R| < 0.4).
Because of the poor results of the all crops analysis, no further
analyses are performed on these data. The numbers of samples (N)
used to develop the correlation plots by crop type are 60, 80,
104, and 106 for rice, alfalfa, cotton, and maize, respectively.
Figure 2 also includes the maximum and minimum Pearson
correlation and the wavelength centroid at which each occurs.
Pearson correlations per crop type are highest in the visible and
NIR, near the red-edge. Correlations between log-transformed
biomass and reflectance outside the red-edge region are also high
across much of the visible and NIR, while relatively lower
correlations exist in the SWIR. Rice, alfalfa, and cotton
generally show high positive correlations with