PE&RS September 2014 - page 873

A Hierarchical Building Detection Method for Very High Resolution Remotely Sensed Images Combined with DSM Using Graph Cut Optimization
Rongjun Qin and Wei Fang
Abstract
Detecting buildings in remotely sensed data plays an important role in
urban analysis and geographical information systems. This study proposes
a hierarchical approach for extracting buildings from very high
resolution (9 cm GSD (Ground Sampling Distance)), multi-spectral aerial
images and matched DSMs (Digital Surface Models). The proposed method has
three steps: first, shadows are detected with a morphological index and
corrected for NDVI (Normalized Difference Vegetation Index) computation;
second, the NDVI is incorporated using a top-hat reconstruction of the
DSM to obtain the initial building mask; finally, a graph cut
optimization based on modified superpixel segmentation is carried out to
consolidate building segments with high probability and thus eliminate
segments that have a low probability of being buildings. Experiments
were performed over the whole Vaihingen dataset, covering 3.4 km² with
around 3000 buildings. The proposed algorithm effectively extracted 94
percent of the buildings with 87 percent correctness. This demonstrates
that the proposed method achieved satisfactory results over a large
dataset and has the potential for many practical applications.
Introduction
The identification and localization of buildings in an urban area are
very important for planning, building analysis, automatic 3D
reconstruction of building models, and change detection (Qin and Gruen,
2014). The development of very high resolution (VHR) remote sensing
images (Qin et al., 2013) creates a possible avenue to sense individual
buildings in an urban scenario, e.g., Ikonos with 1-meter resolution, or
Worldview with 0.5-meter resolution. Sensors with even higher resolution
are in the planning stages (e.g., Geoeye-2 and Worldview-3 with
0.3-meter resolution). However, this increasing level of detail does not
necessarily improve the accuracy of building detection (Huang and Zhang,
2011). Indeed, more detailed image content actually increases spectral
ambiguities in remotely sensed images, caused by objects such as symbol
patterns on the road and large vehicles. Therefore, researchers have
devoted a lot of effort toward using multi-source data and designing
better detection strategies to increase the building detection rate.
Multispectral images provide shadow information as primitives for
building locations. Furthermore, shadow information is especially
effective in single-image-based methods (Huang and Zhang, 2012; Ok,
2013; Ok et al., 2013). Meanwhile, NDVI data extracted from a
multispectral image can be used as a vegetation indicator to eliminate
trees. Vector features such as parallel lines and corner junctions
reveal the characteristics of rectangular buildings, and have been
investigated and used to develop single-image-based methods for building
detection (Lin and Nevatia, 1998; Sirmacek and Unsalan, 2011; Sirmacek
and Unsalan, 2010; Sirmacek and Unsalan, 2009).
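
To make the role of the NDVI as a vegetation indicator concrete, the following is a minimal sketch rather than the method used in this paper. It assumes the standard definition NDVI = (NIR - Red) / (NIR + Red); the function names, the toy reflectance values, and the 0.3 threshold are illustrative assumptions only.

```python
# Minimal, dependency-free sketch of an NDVI-based vegetation mask.
# The function names, sample values, and threshold are hypothetical.

def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); 0.0 where both bands are 0."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Guard against division by zero on empty (no-signal) pixels.
            row.append((n - r) / total if total else 0.0)
        result.append(row)
    return result


def vegetation_mask(ndvi_grid, threshold=0.3):
    """Flag pixels whose NDVI exceeds an assumed, scene-dependent threshold."""
    return [[value > threshold for value in row] for row in ndvi_grid]


# Toy 2 x 2 reflectance grids (hypothetical values).
nir = [[0.8, 0.2], [0.6, 0.0]]
red = [[0.2, 0.2], [0.3, 0.0]]

values = ndvi(nir, red)
mask = vegetation_mask(values)
print(mask)
```

Pixels flagged in the mask would be treated as likely vegetation and excluded from the building candidates; in practice the threshold has to be tuned per scene and sensor.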
Lidar (Light Detection and Ranging) point clouds provide height
information for a ground scene and are used for building detection. By
subtracting the DTM (Digital Terrain Model) from the DSM (Digital
Surface Model), an nDSM (normalized DSM) can be computed to obtain
off-terrain points for building detection (Weidner and Förstner, 1995).
In addition, the multi-return characteristics of lidar provide useful
information for eliminating vegetation in point-cloud-based methods
(Ekhtari et al., 2008; Meng et al., 2009), which increases the accuracy
of building detection.
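
The nDSM computation described above can be sketched as follows. This is a simplified illustration on co-registered height grids, not the cited authors' implementation; the function names, the toy elevations, and the 2.5 m minimum height are assumptions made for the example.

```python
# Minimal, dependency-free sketch: nDSM = DSM - DTM, then a height threshold.
# Function names, sample elevations, and min_height are hypothetical.

def normalized_dsm(dsm, dtm):
    """Subtract terrain height from surface height, cell by cell."""
    return [
        [surface - terrain for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


def off_terrain_mask(ndsm, min_height=2.5):
    """Flag cells that rise at least min_height metres above the terrain."""
    return [[height >= min_height for height in row] for row in ndsm]


# Toy 2 x 2 grids of elevations in metres (hypothetical values).
dsm = [[105.0, 112.0], [104.5, 118.0]]
dtm = [[104.0, 104.0], [104.0, 104.0]]

ndsm = normalized_dsm(dsm, dtm)
off_terrain = off_terrain_mask(ndsm)
print(ndsm)
print(off_terrain)
```

Off-terrain cells include trees as well as buildings, which is why height information alone is usually combined with a vegetation cue.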
Both multispectral images and lidar point clouds have their advantages
and deficiencies. Complex algorithms based on a single image usually
make assumptions concerning building distribution and sometimes are only
able to detect certain types of buildings. For example, methods based on
feature point extraction from a single image are only able to detect
isolated buildings with regular patterns, and methods relying on
parallel lines are not able to detect dome roofs. Compared to
multi-spectral images, lidar point clouds provide accurate height
information, but less accurate boundaries. Lidar point clouds also
contain null values due to occlusion and specular reflection from water
surfaces on the roofs. Therefore, integration of both sources is a
possible direction for improving the accuracy as well as the robustness
of building detection.
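
As a rough illustration of what integrating the two sources can mean at the pixel level, the sketch below keeps cells that are elevated and not vegetated. It is a deliberately simple stand-in, not the hierarchical method proposed in this paper; the function name and the toy masks are assumptions.

```python
# Minimal, dependency-free sketch of fusing a height cue with a spectral cue.
# The function name and the toy masks are hypothetical.

def fuse_masks(off_terrain, vegetation):
    """Keep cells that are off-terrain (height cue) and not vegetation (spectral cue)."""
    return [
        [elevated and not vegetated for elevated, vegetated in zip(o_row, v_row)]
        for o_row, v_row in zip(off_terrain, vegetation)
    ]


# Toy 2 x 2 masks: True means "elevated" and "vegetated" respectively.
off_terrain = [[True, True], [False, True]]
vegetation = [[False, True], [False, False]]

candidates = fuse_masks(off_terrain, vegetation)
print(candidates)
```

The elevated, vegetated cell (a tree in this toy example) is rejected, while the elevated, non-vegetated cells remain as building candidates.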
There has been a spate of integrated methods proposed in the literature.
Rottensteiner et al. (2007) and Rottensteiner et al. (2005) proposed a
supervised classification-based build
Rongjun Qin is with the Singapore ETH Center, Future Cities
Laboratory, ETH, Zurich. 1 CREATE Way, #06-01 CREATE
Tower, Singapore 138602.
Wei Fang is with the Singapore ETH Center, Future Cities
Laboratory, ETH, Zurich. 1 CREATE Way, #06-01 CREATE
Tower, Singapore 138602, and the State Key Laboratory of
Information Engineering in Surveying, Mapping and Remote
Sensing (LIESMARS), Wuhan University, China. #129, Luoyu
Road, Wuchang District, LIESMARS, Wuhan University,
Wuhan, P. R. China, 430079.
Photogrammetric Engineering & Remote Sensing
Vol. 80, No. 9, September 2014, pp. 873–883.
0099-1112/14/8009–873
© 2014 American Society for Photogrammetry
and Remote Sensing
doi: 10.14358/PERS.80.9.873