
Unsupervised Source Selection
for Domain Adaptation
Karsten Vogt, Andreas Paul, Jörn Ostermann, Franz Rottensteiner, and Christian Heipke
Abstract
The creation of training sets for supervised machine learning often incurs unsustainable manual costs. Transfer learning (TL) techniques have been proposed as a way to solve this issue by adapting training data from different, but related (source) datasets to the test (target) dataset. A problem in TL is how to quantify the relatedness of a source quickly and robustly. In this work, we present a fast domain similarity measure that captures the relatedness between datasets purely based on unlabeled data. Our method transfers knowledge from multiple sources by generating a weighted combination of domains. We show for multiple datasets that learning on such sources achieves an average overall accuracy within 2.5 percent of the results of the target classifier for semantic segmentation tasks. We further apply our method to the task of choosing informative patches from unlabeled datasets. Labeling only these patches enables a reduction in manual work of up to 85 percent.
Introduction
Supervised classification plays an important role in extracting semantic information from remote sensing imagery. From statistical considerations, it can be expected that the estimation of any complex model with high accuracy will require large amounts of training data. While unlabeled data are abundant and are already used successfully in unsupervised and semi-supervised learning methods, they cannot completely replace the dependence on labeled data. On the other hand, the acquisition of high-quality, densely sampled, and representative labeled samples is an expensive and time-consuming task. Transfer Learning (TL) is a paradigm that strives to vastly reduce the amount of required training data by utilizing knowledge from related learning tasks (Thrun and Pratt, 1998; Pan and Yang, 2010). In particular, the aim of TL is to adapt a classifier trained on data from a source domain to a target domain. The only assumption made is that these domains are different but related. We are interested in one specific setting of TL called domain adaptation (DA). DA methods assume the source and target domains to differ only in the marginal distributions of the features and the posterior class distributions (Bruzzone and Marconcini, 2009). The performance of DA depends on how the source is related to the target (Eaton et al., 2008). From that point of view, DA can be divided into two steps: find the most similar sources and transfer knowledge from these sources to the target. In this context, the major challenge in source selection is how to measure the similarity of domains.
In this paper, we address the problems of searching for similar sources, also known as source selection, and of integrating the results into DA. As unlabeled data are abundant, our proposed method is based only on similarity measurements between the marginal distributions of the features in the source and target domains. We apply our source selection method to two different data acquisition settings: domain selection and domain ranking. In domain selection, given a target domain and a list of candidate source domains, we assign weights to these sources based on the Maximum Mean Discrepancy (MMD) metric to the target. For these candidate source domains, we assume that some labeled training data are available from earlier surveys. We then apply multi-source selection by transferring knowledge from multiple weighted source domains simultaneously. Additionally, we extend the approach for DA presented in (Paul et al., 2016) so that it can benefit from multi-source selection. In the domain ranking setting, we have to process many initially unlabeled target domains for which no training data are available. Using our multi-source selection algorithm, our goal is to rank these domains in terms of their informativeness. This information helps us select the most important domains for manual labeling, which reduces the effort for the generation of training data while keeping the classification error at an acceptable level. Finally, we propose an improvement of the MMD metric for its application in source selection with many candidate sources. This Asymmetric Maximum Mean Discrepancy significantly reduces the memory footprint for each source while featuring linear runtime complexity by exploiting the asymmetric relationship between target and source domains. We evaluate our methods on the Vaihingen and Potsdam datasets from the ISPRS 2D semantic labeling challenge (Wegner et al., 2016) and on a third, even more challenging, dataset based on aerial imagery of three German cities.
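The standard MMD between two domains can be estimated from unlabeled feature samples alone using only kernel evaluations. The following is a minimal sketch of such a similarity measure and of ranking candidate sources by it; the RBF kernel, its bandwidth, and the synthetic sample arrays are illustrative assumptions, not the kernel or weighting scheme used in this paper:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy
    between target samples x and source samples y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Rank candidate sources by similarity to the target: a smaller MMD
# means the marginal feature distributions are more alike.
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 8))
sources = {
    "near": rng.normal(0.1, 1.0, size=(200, 8)),  # related domain
    "far": rng.normal(3.0, 1.0, size=(200, 8)),   # unrelated domain
}
ranking = sorted(sources, key=lambda name: mmd2(target, sources[name]))
print(ranking)  # the related domain comes first: ['near', 'far']
```

Note that this estimator is quadratic in the number of samples and materializes full kernel matrices; it does not attempt to reproduce the linear-runtime, low-memory behavior of the Asymmetric MMD proposed below.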
Related Work
In our work, we use notation according to Pan and Yang (2010). A domain D = {𝒳, P(X)} consists of a feature space 𝒳 and a marginal probability distribution P(X) with X ∈ 𝒳. A task for a given domain is defined as T = {𝒞, h(·)}, consisting of a label space 𝒞 and a predictive function h(·). The predictive function can be learned from the training data {x_r, C_r}, where x_r ∈ 𝒳 and C_r ∈ 𝒞. We consider a target T, for which we want to learn a predictive function h(x), and a source S, from which some knowledge can be transferred. Both T and S are fully described by their domains and their tasks. In our work, we consider at least one source domain D_S and only one target domain D_T for the domain selection setting, and more than one target domain for the domain ranking setting. There are different settings of TL. Our focus is on DA, which is a special sub-category of the
Karsten Vogt and Jörn Ostermann are with the Institut für Informationsverarbeitung, Leibniz Universität Hannover.
Andreas Paul, Franz Rottensteiner, and Christian Heipke are with the Institute of Photogrammetry and Geoinformation, Leibniz Universität Hannover.
Photogrammetric Engineering & Remote Sensing
Vol. 84, No. 5, May 2018, pp. 249–261.
0099-1112/18/249–261
© 2018 American Society for Photogrammetry and Remote Sensing
doi: 10.14358/PERS.84.5.249