
Unsupervised Source Selection
for Domain Adaptation
Karsten Vogt, Andreas Paul, Jörn Ostermann, Franz Rottensteiner, and Christian Heipke
Abstract
The creation of training sets for supervised machine learning often incurs unsustainable manual costs. Transfer learning (TL) techniques have been proposed as a way to solve this issue by adapting training data from different, but related (source) datasets to the test (target) dataset. A problem in TL is how to quantify the relatedness of a source quickly and robustly. In this work, we present a fast domain similarity measure that captures the relatedness between datasets purely based on unlabeled data. Our method transfers knowledge from multiple sources by generating a weighted combination of domains. We show for multiple datasets that learning on such sources achieves an average overall accuracy within 2.5 percent of the results of the target classifier for semantic segmentation tasks. We further apply our method to the task of choosing informative patches from unlabeled datasets. Labeling only these patches enables a reduction in manual work of up to 85 percent.
Introduction
Supervised classification plays an important role in extracting semantic information from remote sensing imagery. From statistical considerations, it can be expected that the estimation of any complex model with high accuracy will require large amounts of training data. While unlabeled data are abundant and are already used successfully in unsupervised and semi-supervised learning methods, they cannot completely replace the dependence on labeled data. On the other hand, the acquisition of high-quality, densely sampled, and representative labeled samples is an expensive and time-consuming task. Transfer Learning (TL) is a paradigm that strives to vastly reduce the amount of required training data by utilizing knowledge from related learning tasks (Thrun and Pratt, 1998; Pan and Yang, 2010). In particular, the aim of TL is to adapt a classifier trained on data from a source domain to a target domain. The only assumption made is that these domains are different but related. We are interested in one specific setting of TL called domain adaptation (DA). DA methods assume the source and target domains to differ only in the marginal distributions of the features and the posterior class distributions (Bruzzone and Marconcini, 2009). The performance of DA depends on how the source is related to the target (Eaton et al., 2008). From that point of view, DA can be divided into two steps: find the most similar sources and transfer knowledge from these sources to the target. In this context, the major challenge in source selection is how to measure the similarity of domains.
In this paper, we address the problems of searching for similar sources, also known as source selection, and of integrating the results into DA. As unlabeled data are abundant, our proposed method is based only on similarity measurements between the marginal distributions of the features in the source and target domains. We apply our source selection method to two different data acquisition settings: domain selection and domain ranking. In domain selection, given a target domain and a list of candidate source domains, we assign weights to these sources based on the Maximum Mean Discrepancy (MMD) metric to the target. For these candidate source domains, we assume that some labeled training data are available from earlier surveys. We then apply multi-source selection by transferring knowledge from multiple weighted source domains simultaneously. Additionally, we extend the approach for DA presented in (Paul et al., 2016) so that it can benefit from multi-source selection. In the domain ranking setting, we have to process many initially unlabeled target domains for which no training data are available. Using our multi-source selection algorithm, our goal is to rank these domains in terms of their informativeness. This information helps us select the most important domains for manual labeling, which reduces the effort for the generation of training data while keeping the classification error at an acceptable level. Finally, we propose an improvement of the MMD metric for its application in source selection with many candidate sources. This Asymmetric Maximum Mean Discrepancy significantly reduces the memory footprint for each source while featuring linear runtime complexity by exploiting the asymmetric relationship between target and source domains. We evaluate our methods on the Vaihingen and Potsdam datasets from the ISPRS 2D semantic labeling challenge (Wegner et al., 2016) and on a third, even more challenging, dataset based on aerial imagery of three German cities.
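The standard MMD between two domains can be estimated from unlabeled feature samples alone using only kernel evaluations. The following is a minimal sketch of such a similarity measure and of ranking candidate sources by it; the RBF kernel, its bandwidth, and the synthetic sample arrays are illustrative assumptions, not the kernel or weighting scheme used in this paper:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy
    between target samples x and source samples y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Rank candidate sources by similarity to the target: a smaller MMD
# means the marginal feature distributions are more alike.
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 8))
sources = {
    "near": rng.normal(0.1, 1.0, size=(200, 8)),  # related domain
    "far": rng.normal(3.0, 1.0, size=(200, 8)),   # unrelated domain
}
ranking = sorted(sources, key=lambda name: mmd2(target, sources[name]))
print(ranking)  # the related domain comes first: ['near', 'far']
```

Note that this estimator is quadratic in the number of samples and materializes full kernel matrices; it does not attempt to reproduce the linear-runtime, low-memory behavior of the Asymmetric MMD proposed below.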
Related Work
In our work, we use notation according to Pan and Yang (2010). A domain D = {𝒳, P(X)} consists of a feature space 𝒳 and a marginal probability distribution P(X) with X ∈ 𝒳. A task for a given domain is defined as T = {𝒞, h(·)}, consisting of a label space 𝒞 and a predictive function h(·). The predictive function can be learned from the training data {x_r, C_r}, where x_r ∈ 𝒳 and C_r ∈ 𝒞. We consider a target T, for which we want to learn a predictive function h(x), and a source S, from which some knowledge can be transferred. Both T and S are fully described by their domains and their tasks. In our work, we consider at least one source domain D_S and only one target domain D_T for the domain selection setting, and more than one target domain for the domain ranking setting. There are different settings of TL. Our focus is on DA, which is a special sub-category of the
Karsten Vogt and Jörn Ostermann are with the Institut für Informationsverarbeitung, Leibniz Universität Hannover.
Andreas Paul, Franz Rottensteiner, and Christian Heipke are with the Institute of Photogrammetry and Geoinformation, Leibniz Universität Hannover.
Photogrammetric Engineering & Remote Sensing
Vol. 84, No. 5, May 2018, pp. 249–261.
0099-1112/18/249–261
© 2018 American Society for Photogrammetry and Remote Sensing
doi: 10.14358/PERS.84.5.249