PE&RS March 2019 Public

Land Cover Classification in Combined Elevation and Optical Images Supported by OSM Data, Mixed-level Features, and Non-local Optimization Algorithms

Dimitri Bulatov, Gisela Häufel, Lukas Lucks, and Melanie Pohl

Abstract

Land cover classification from airborne data is considered a challenging task in Remote Sensing. Even in the case of available elevation data, shadows and strong intra-class variations of appearances are abundant in urban terrain. In this paper, we propose an approach for supervised land cover classification that has three main contributions. Firstly, for the cumbersome task of training data sampling, we propose an algorithm which combines the freely available OpenStreetMap data with the actual sensor data and requires only a minimum of user interaction. The key idea of this algorithm is to rasterize the vector data using a fast segmentation result. Secondly, pixel-wise classification may take long and be quite sensitive to the resolution and quality of input data. Therefore, superpixel decomposition of images, supported by a general framework on operations with superpixels, guarantees fast grouping of pixel-wise features and their assignment to one of four important classes (building, tree, grass, and road). Particularly for extraction of street canyons lying in shadowy regions, high-level features based on stripes are introduced. Finally, the output of a probabilistic learning algorithm can be post-processed by a non-local optimization module operating on Markov Random Fields, which allows noisy results to be corrected using a smoothness prior. Extensive tests on three datasets of quite different nature have been performed with two probabilistic learners: the well-known Random Forest and the far less known Import Vector Machine. Thus, this work provides insights about promising feature sets for both classifiers. The quantitative results for the ISPRS benchmark dataset Vaihingen are promising, achieving up to 94.5% and 87.1% accuracy on superpixel and on pixel level, respectively, despite the fact that only around 10% of available labeled data were used. At the same time, the results for two additional datasets, validated with the autonomously acquired training data, yielded a significantly lower number of misclassified superpixels. This confirms that the proposed algorithm for training data extraction works quite well in reducing errors of the second kind. However, it tends to extract predominantly large and easy-to-classify areas, while in complicated, ambiguous regions, errors of the first kind often occur. For this and other shortcomings of the algorithm, directions of future research are outlined.

Introduction

Motivation

Land cover classification, especially in urban and semi-urban environments, is a key step for creating semantic 3D models from airborne sensor data. As mentioned, e.g., by Bulatov et al. (2014), the advantages of semantic models are higher reality content, a flexible level of compression, and better interpretability and interoperability on the part of non-expert users. Buildings are modeled at a desired level of detail, trees and vehicles are placed at the positions where they have been detected and are represented by geo-specific models from a library, etc. The emphasis of that and comparable studies (Haala, 2005; Lafarge and Mallet, 2012) was predominantly laid on reconstruction of complicated objects, in particular, building roof types. However, not much effort was invested into a precise and reliable subdivision of the underlying terrain into different classes. Therefore, at most a few very discriminative features, such as elevation and color indexes (like NDVI, the normalized difference vegetation index), were considered to separate buildings from trees, roads from vehicles, and water bodies from grass areas.
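To make the notion of a few discriminative features concrete, the following minimal sketch separates four classes with two hand-set thresholds, one on NDVI and one on height above ground. It is an illustration only and not the procedure of any of the cited studies: the function names, the threshold values `NDVI_MIN` and `HEIGHT_MIN`, and the class rules are assumptions chosen for this example. NDVI itself is the standard ratio (NIR - red) / (NIR + red).

```python
# Illustrative rule-based separation; thresholds are hypothetical values.
NDVI_MIN = 0.3    # assumed: above this, a pixel counts as vegetated
HEIGHT_MIN = 2.0  # assumed: metres above ground; above this, a pixel is elevated


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index of one pixel."""
    total = nir + red
    if total == 0.0:
        return 0.0  # avoid division by zero for completely dark pixels
    return (nir - red) / total


def rule_based_label(nir: float, red: float, height_above_ground: float) -> str:
    """Assign one of four classes from two thresholded features."""
    vegetated = ndvi(nir, red) > NDVI_MIN
    elevated = height_above_ground > HEIGHT_MIN
    if elevated:
        return "tree" if vegetated else "building"
    return "grass" if vegetated else "road"
```

Such fixed rules break down quickly under shadows or strong intra-class variation, because neither threshold adapts to the data at hand.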

In order to perform a more systematic preparation of data for the aforementioned reconstruction task, classification of the underlying terrain is necessary. In real-case scenarios, there are plenty of factors hindering a correct assignment of pixels to classes, which would later result in incorrect building outlines, wiggly street courses, etc. Assignment errors are sometimes related to rare or overlapping classes, such as hills covered by shrubbery and grass, destroyed buildings, bridges at a non-negligible height above the ground, etc. Even without taking these anomalies into account, or when setting them right at a later point, the latest developments have brought about an extremely broad spectrum of airborne sensors and their products. Combined with the rather heterogeneous scenes to be captured, this may produce very particular patterns of texture, distributions of shadows, and types of objects to be classified. Under these circumstances, it is not realistic to obtain a good classification result without (1) taking into account examples of the data currently investigated, (2) computing sometimes sophisticated features, and (3) applying a classification algorithm supposed to learn thresholds on features for separation of training data and thus classify the test data. In other words, classification approaches have three major ingredients: training data, feature set, and learning algorithm.
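The interplay of the three ingredients can be illustrated with the simplest possible learner, a one-feature decision stump that learns a threshold from labeled samples. This is a sketch under stated assumptions and not one of the classifiers evaluated in this paper: the names `fit_stump` and `predict`, the toy height values, and the binary labels (0 for ground, 1 for building) are all hypothetical.

```python
from typing import List, Tuple


def fit_stump(values: List[float], labels: List[int]) -> Tuple[float, int]:
    """Learn a threshold and polarity on one feature from training data.

    A sample x is predicted as class 1 when polarity * (x - threshold) > 0.
    The pair with the fewest training errors is returned.
    """
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        candidates = [distinct[0]]
    best = None
    for threshold in candidates:
        for polarity in (1, -1):
            errors = sum(
                predict(x, threshold, polarity) != y
                for x, y in zip(values, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    return best[1], best[2]


def predict(x: float, threshold: float, polarity: int) -> int:
    """Classify one test sample with the learned threshold."""
    return 1 if polarity * (x - threshold) > 0 else 0


# Training data (hypothetical): height above ground in metres, with labels.
train_heights = [0.1, 0.3, 0.5, 6.0, 9.0, 12.0]
train_labels = [0, 0, 0, 1, 1, 1]

threshold, polarity = fit_stump(train_heights, train_labels)
```

On this toy data, the learned threshold falls midway between the highest ground sample and the lowest building sample, so `predict(7.0, threshold, polarity)` yields 1 and `predict(0.2, threshold, polarity)` yields 0. Learners such as Random Forest combine many such threshold decisions over many features.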

Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, Gutleuthausstr. 1, 76275 Ettlingen, Germany (dimitri.bulatov@iosb.fraunhofer.de).

Photogrammetric Engineering & Remote Sensing, Vol. 85, No. 3, March 2019, pp. 179–195.

0099-1112/18/179–195

doi: 10.14358/PERS.85.3.179

Contributions

In this work, we will focus on the three components mentioned above. Firstly, it is interesting to investigate to what
