PE&RS June 2001
VOLUME 67, NUMBER 6
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
JOURNAL OF THE AMERICAN SOCIETY FOR PHOTOGRAMMETRY AND REMOTE SENSING
Highlight Article

Introduction

In late 2000, the U.S. Geological Survey (USGS) EROS Data Center (EDC) completed the circa 1992 National Land Cover Data set (NLCD). The NLCD, derived from early 1990s Landsat Thematic Mapper (TM) imagery and other sources of digital data, represents an intermediate-scale national land cover data set. The resolution of this data set lends itself to many regional to national scale investigations, including analyses of water quality, ecosystem health, wildlife habitat, land cover assessments and other land management issues. The purpose of this paper is to describe the characteristics and uses of this data set.

One of the goals in the development of the NLCD was to generate a reasonably consistent and seamless 30 meter product for the conterminous United States. The methodology employed to develop the NLCD is analogous to the database approach originally envisioned by Lauer (1986). The early developmental stages of the data set have been described elsewhere (Vogelmann et al. 1998a and b). In addition to the spectral information provided by the TM, ancillary spatial data were used to improve classification results, when appropriate. The NLCD may be considered as a replacement/update to the intermediate-scale Land Use and Land Cover data set (USGS, 1990) developed in the 1970s and early 1980s.

Materials & Methods

Landsat TM data used to develop the NLCD were terrain-corrected using 3-arc-second digital terrain elevation data (DTED; U.S. Geological Survey, 1993). The TM data were geo-registered to the Albers Equal Area projection grid using ground control points, resulting in a root mean square registration error of less than one pixel (30 m). Two or more TM data sets for each path/row, representing different seasons, were used to generate the NLCD product.
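The registration criterion mentioned above (root mean square error below one 30 m pixel) can be checked with a few lines of code. The sketch below assumes the common planimetric definition of RMS error over ground control points; the paper does not state the exact formula used, and the function name and coordinates are hypothetical.

```python
import math

PIXEL_SIZE_M = 30.0  # TM pixel size given in the text


def registration_rmse(registered, reference):
    """Planimetric RMS error (same units as the inputs) over control points.

    registered, reference: equal-length lists of (x, y) map coordinates.
    This is an assumed, commonly used definition, not the paper's stated one.
    """
    squared = [
        (gx - rx) ** 2 + (gy - ry) ** 2
        for (gx, gy), (rx, ry) in zip(registered, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Made-up control points in projected meters (illustrative only).
reference = [(1000.0, 2000.0), (5000.0, 2500.0), (3000.0, 7000.0), (8000.0, 9000.0)]
registered = [(1003.0, 2004.0), (5000.0, 2500.0), (3006.0, 7008.0), (7997.0, 8996.0)]

rmse_m = registration_rmse(registered, reference)
within_one_pixel = rmse_m < PIXEL_SIZE_M
```

With these made-up points the error is well under one pixel, so `within_one_pixel` is true.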
Using two or more TM scenes representing different times of the growing season (e.g. leaf-on and leaf-off) generally improves the quality of the land cover information that can be derived, compared with analysis of a single scene.

Ancillary Spatial Data. In addition to TM data, a variety of other intermediate-scale spatial data were used to help develop the NLCD; these included Digital Terrain Elevation Data (DTED) and derivative DTED products (slope, aspect, shaded relief), population density data at the census block level (Bureau of the Census, 1991 a and b; 1992), Land Use and Land Cover data (USGS 1990), and digital National Wetlands Inventory data (NWI; Fish and Wildlife Service, 1996). Other data sets were used to a lesser extent, and included available water capacity and organic carbon (0-40 cm depth) data derived from the State Soil Geographic (STATSGO) Data Base (U.S. Department of Agriculture, 1994), and land cover information derived from various state or national programs. Land cover data from the USGS Biological Resource Division Gap Analysis Program (Scott et al., 1996) were used when available.
General Classification Methods. Mosaics of leaf-on
(i.e. summer) and leaf-off (i.e. spring) imagery were generated for
each of 31 regional units (based on political administrative units,
contiguity of Landsat scenes and data volume) covering the conterminous
United States (Figure 1). Each unit was classified separately using
one of several methods. In most cases the general thematic approach
was to designate either the leaf-on or the leaf-off mosaic as the “baseline” data
set, and to use that spectral information as the primary source of
information from which to derive the classification product. The decision
as to which mosaic to use was based on a subjective evaluation of which
appeared to be the “best” mosaic in terms of overall data quality and
information content. Leaf-off mosaics were chosen as baseline data
sets more frequently than leaf-on mosaics. Ancillary spatial information,
including spectral data from the image mosaic representing the other
season, was used to refine or aid in the labeling process. After development
of a first order classification product, a series of recoding operations
was performed to fix obvious misclassifications and to further refine
the classification.

During the initial stages of the project, the primary steps for generating the NLCD classification product were:
(1) Cluster the baseline mosaic using unsupervised classification.
(2) Interpret and label clusters using aerial photographs.
(3) Resolve confused clusters by constructing logical or threshold models that utilize appropriate ancillary data.
(4) Develop and incorporate information from onscreen digitizing (e.g. quarries, transitional bare areas).

As the project progressed, some classification teams individualized the approach on a case-by-case basis. More in-depth methodology for this process has been described elsewhere (Vogelmann et al., 1998b). Modifications were based upon data quality issues, characteristics of the region being analyzed, and familiarity with other approaches that would facilitate and/or more readily automate the classification process.

In the case of multiscene mosaics, spectral clusters developed from unsupervised clustering can be very complex, and a single cluster may represent many different types of land cover. In these cases, splitting the clusters into meaningful land cover units via modeling based on spectral and ancillary data can be quite difficult. Not only are the thresholds used to separate land cover classes important, but the order in which the thresholds are applied can also have substantial effects on the land cover estimates. Determining the optimal set of thresholds and the optimal order of implementation for complex “confused clusters” can be time consuming and difficult.

One approach taken was to use decision trees to facilitate the modeling process. Decision trees derive objective, efficient, and ordered thresholds using non-parametric techniques (Friedl and Brodley 1997; Hansen et al. 1996). Decision trees have been successfully used to derive land cover (Friedl and Brodley 1997) and to identify important data layers or spectral bands (Hansen et al. 1996; Prince and Steininger 1999).
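The sensitivity to threshold order noted above can be illustrated with a toy sketch. It is not the project's modeling code: the ancillary attributes, threshold values, and class labels are all hypothetical, chosen only to show that the same two rules, applied in a different order, yield different land cover tallies for one confused cluster.

```python
def apply_rules(pixel, rules):
    """Return the label of the first rule whose test passes."""
    for test, label in rules:
        if test(pixel):
            return label
    return "unresolved"


# Two hypothetical threshold rules on ancillary layers (values are made up).
steep_is_forest = (lambda p: p["slope_deg"] > 15, "forest")
dense_is_urban = (lambda p: p["pop_density"] > 1000, "urban")

# Pixels belonging to one confused spectral cluster, with made-up attributes.
pixels = [
    {"slope_deg": 20, "pop_density": 1500},  # satisfies both rules
    {"slope_deg": 5, "pop_density": 2000},   # satisfies the urban rule only
    {"slope_deg": 25, "pop_density": 10},    # satisfies the forest rule only
    {"slope_deg": 2, "pop_density": 50},     # satisfies neither rule
]


def tally(rules):
    """Count how many pixels receive each label under an ordered rule list."""
    counts = {}
    for pixel in pixels:
        label = apply_rules(pixel, rules)
        counts[label] = counts.get(label, 0) + 1
    return counts


forest_first = tally([steep_is_forest, dense_is_urban])
urban_first = tally([dense_is_urban, steep_is_forest])
```

Only the first pixel satisfies both rules, yet it alone shifts one unit of area from forest to urban when the order is reversed. In a real mosaic, such overlaps can involve many pixels.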
For decision tree training, the image clusters were included only as an ancillary data set. Multiple decision trees, trained with different combinations of data layers and/or decision tree parameter options, produced multiple land cover maps. For each pixel, the majority land cover across these maps was used to form a preliminary land cover map. Decision trees were also used to help define classification “rules” in an expert system classifier (ERDAS 1998). The expert system allowed quick identification of which rules affected which pixels and quick modification of either rules or rule confidence levels. While it was apparent that the decision trees maintained a high degree of detail in the land cover products, the need for visual inspection and “heads up” on-screen corrections of the preliminary land cover persisted.

In all cases, results were scrutinized in order to ensure comparability of land cover data among regions and to correct obvious classification errors. Edge-matching of adjacent mosaics was then performed to yield a reasonably seamless national-scale product. This was a major task due to seasonal and interpretation differences that invariably resulted in thematic seam lines at the boundaries of the mosaics. As the mosaics were finally pieced together, individual states were extracted using boundaries defined by the 1:100,000 scale Digital Line Graph series. The state files were designated “preliminary” and made available to the user community for review and feedback.

Initially, the state land cover files were available to users who contacted the USGS/EDC Land Cover Applications Center to gain access to the data. The intent was to ensure that users understood the “preliminary” nature of the data, as well as its limitations, and to register the user and solicit feedback regarding the utility of the data and any problems that were identified. As feedback was received, it was reviewed and a determination was made as to whether an update was required.
In many cases, updates were made, and the registered users were apprised of the changes.

Accuracy Assessment Methods. When all the states comprising a Federal Region were finalized, the accuracy assessment (AA) phase was initiated. At this time, the AA for regions 1-4 (Figure 2) is complete, the AAs for regions 5, 7, and 10 are underway, and the remainder are in the planning stages. The accuracy assessment was based on interpretations of 1990 vintage aerial photographs acquired by the National Aerial Photography Program (NAPP).

The accuracy assessment of the NLCD relied on a probability sampling design. The sampling design incorporated three levels of stratification and a two-stage cluster sampling protocol (Stehman et al., 2000). Each Federal region constituted a stratum and was sampled independently. Within each mapping region, geographic strata were created using 15 x 15 or 30 x 30 minute grid cells, depending on the size of the region. Primary sampling units (PSUs), defined by non-overlapping, interior regions of NAPP photographs, were delineated within these strata. A single PSU was randomly selected from each grid cell, with all PSUs having an equal probability of being selected. All pixels within the selected first-stage PSUs were stratified by mapped land cover class, and a simple random sample of approximately 80 to 100 pixels was selected for each land cover class.

To obtain reference land cover class labels, each sample (pixel) was identified on a NAPP aerial photograph. A suite of attributes was collected by photointerpreters, including primary and alternate land cover labels (an alternate reference label was provided only when appropriate), land cover heterogeneity in the vicinity of the sample unit, and a confidence rating of the photointerpreted land cover label. For a more detailed discussion of the reference data collection and evaluation, refer to Yang et al. (2000) and Zhu et al. (2000).
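The two-stage selection described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's sampling software: the data structures, function name, identifiers, and the tiny per-class sample size are hypothetical (the text reports roughly 80 to 100 pixels per class), and equal selection probability is assumed to apply among the PSUs within each grid cell.

```python
import random


def two_stage_sample(psus_by_cell, pixels_by_psu, n_per_class, seed=0):
    """Sketch of a two-stage, class-stratified sample within one region.

    psus_by_cell:  {grid_cell_id: [psu_id, ...]}
    pixels_by_psu: {psu_id: [(pixel_id, mapped_class), ...]}
    Returns (selected_psus, {mapped_class: [pixel_id, ...]}).
    """
    rng = random.Random(seed)

    # Stage 1: one PSU per grid cell, each PSU in the cell equally likely.
    selected = [rng.choice(sorted(psus)) for _, psus in sorted(psus_by_cell.items())]

    # Stratify all pixels in the selected PSUs by their mapped class.
    by_class = {}
    for psu in selected:
        for pixel_id, mapped_class in pixels_by_psu[psu]:
            by_class.setdefault(mapped_class, []).append(pixel_id)

    # Stage 2: simple random sample (without replacement) within each class.
    sample = {}
    for mapped_class, ids in sorted(by_class.items()):
        sample[mapped_class] = rng.sample(ids, min(n_per_class, len(ids)))
    return selected, sample


# Hypothetical toy region: two grid cells, three PSUs, ten pixels per PSU.
psus_by_cell = {"cell_A": ["psu_1", "psu_2"], "cell_B": ["psu_3"]}
pixels_by_psu = {
    psu: [(f"{psu}_px{i}", "forest" if i % 2 else "water") for i in range(10)]
    for psu in ("psu_1", "psu_2", "psu_3")
}

selected, sample = two_stage_sample(psus_by_cell, pixels_by_psu, n_per_class=3)
```

Because pixels are drawn with unequal inclusion probabilities across classes, accuracy estimates computed from such a sample need design-based weighting rather than simple proportions of sampled pixels.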
For each mapping region, stratified sampling formulas were applied to estimate the error matrix cell proportions (Stehman and Czaplewski, 1998) and, subsequently, the overall accuracy and the class-specific user's and producer's accuracies (Story and Congalton, 1986). The use of stratified formulas is important because of the sampling methods chosen for the project. Accuracy results were computed by weighting the cell proportions by the proportion of each land cover class within a given Federal region.

Results and Discussion
Land cover for larger-scale areas located in Colorado (Figure 4) illustrates
the level of detail typical of the NLCD; the image to the right
shows the data set at full resolution.
The equal-area projection of the NLCD allows easy area tabulations of the
various land cover classes. Table 2 shows the percentage land cover estimates
derived from raw pixel count for each of the ten U.S. Federal Regions of the
conterminous U.S. This information is also shown graphically for the conterminous
U.S. in Figure 5. At a glance, it can be seen that the four forest classes
(deciduous, evergreen, mixed forest and woody wetlands) make up a significant
proportion of the NLCD, combining for a total of 32.1% of the area of the conterminous
U.S. Agriculture (pasture/hay, row crops, small grains, fallow and orchards/vineyards)
makes up about 26.4% of the surface area of the conterminous U.S. Urban classes
(low and high intensity residential, commercial/industrial/transportation)
account for 2.0% of the surface area. Land cover area estimates, based on raw
pixel counts from the 1992 National Land Cover Data set, are also available
for each of the 48 conterminous U.S. states and may be obtained at: http://edcwww.cr.usgs.gov/programs/lccp/natllandcover.html
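Because every 30 m pixel in an equal-area projection covers the same ground area, the area tabulation described above reduces to counting pixels. A minimal sketch, assuming a small made-up grid of class labels rather than actual NLCD data (the function name and labels are hypothetical):

```python
from collections import Counter

PIXEL_SIZE_M = 30.0                          # NLCD pixel size from the text
PIXEL_AREA_KM2 = PIXEL_SIZE_M ** 2 / 1e6     # 900 square meters per pixel


def tabulate_land_cover(grid):
    """Return pixel count, area, and percent cover for each class in a grid."""
    counts = Counter(label for row in grid for label in row)
    total = sum(counts.values())
    return {
        label: {
            "pixels": n,
            "area_km2": n * PIXEL_AREA_KM2,
            "percent": 100.0 * n / total,
        }
        for label, n in counts.items()
    }


# Made-up 4 x 5 grid of class labels (illustrative only).
grid = [
    ["forest"] * 5,
    ["forest"] * 5,
    ["row crops"] * 5,
    ["row crops", "water", "water", "water", "water"],
]

table = tabulate_land_cover(grid)
```

Here `table["forest"]["percent"]` is 50.0. The same counting applies to any region clipped from the data set, provided it stays in the equal-area projection.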
Accuracy. Several rules for defining agreement between map and reference data may be applied given the information collected from NAPP photos and land cover maps. Comparing results across a range of agreement protocols and data sub-sets permits evaluation of the reference data quality and more thorough investigation of thematic map accuracy (Congalton and Green 1993, Khorram et al. 1999). In this paper, accuracy results are briefly reported for the first four regions in the eastern United States combined (see Figure 2) by defining agreement as a match between the primary or alternate reference land cover label and the mode land cover label in a 3x3 pixel window surrounding the sample. Detailed region specific accuracy assessment results completed thus