
hard to obtain images taken by UAVs§ containing the entire plant structure without any occlusion from a densely planted crop field. Therefore, we decided to ignore the morphological information by using patch sequences. This strategy enables our proposed method to be applicable and robust in detecting the drought plant stress condition in challenging datasets, such as the MSTCivil Dataset.
Feature Extraction
After we obtain the patch sequences, instead of using handcrafted features, we use Convolutional Neural Networks (CNNs) to extract discriminative features from each image in the patch sequence. In previous works, handcrafted features have been widely used for representing crop stress conditions [35], [19]. However, due to the complex physiological effects of the drought stress condition, handcrafted features, which mostly focus on specific characteristics, might discard significant amounts of the underlying conceptual information. The difference between handcrafted features and features learned by a CNN is that multi-layered learning models such as CNNs are not only able to explore low-level features in the lower layers, but can also yield conceptual abstractions from the higher layers. Hence, a CNN is suitable for our feature extraction task.
In this work, the pre-trained CNN model, VGG-16 [26] (shown in Figure 4), is adopted for the feature extraction. Based on the VGG-16 model with weights pre-trained on ImageNet [5], we first fine-tune the model for the drought plant classification task, and then we use the fine-tuned model as a feature extractor on the image patch sequences.
Fine-tuning a deep learning network is a procedure based on the concept of transfer learning [2], [6]. We first initialize the VGG-16 model using the weights learned from the ImageNet dataset. Then, we truncate the last layer of the model, which is a softmax layer targeting the 1,000 classes of the ImageNet dataset, and replace it with a new softmax layer targeting two classes (the drought condition and the control condition). The new softmax layer is trained using the backpropagation algorithm with our plant image data, which are the image patches in the patch sequences (each image patch inherits the label of the patch sequence it belongs to). In order to transfer the knowledge learned from the broad domain (the ImageNet dataset) into our specific domain, we freeze the weights of the first ten layers so that they remain intact throughout the fine-tuning process. To fine-tune the model, we minimize the cross-entropy function using the stochastic gradient descent algorithm with an initial learning rate of 10⁻⁴, which is smaller than the learning rate used for training the model from scratch. Finally, we use the fine-tuned model to extract spatial features from each image in the patch sequence. Each image in the patch sequence is fed to the fine-tuned model as input and passed through a stack of convolutional layers and three fully connected layers. The activation before the last fully connected layer is taken as the extracted feature vector of the input. Thus, after the feature extraction step, each patch sequence is represented by a sequence of feature vectors for the classification, where each image in the patch sequence is represented by a 1 × 4096 feature vector in the feature vector sequence.
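As a rough sketch of how the new two-class softmax layer is trained with cross entropy and SGD at a learning rate of 10⁻⁴, the following illustration uses random placeholder data standing in for the 4096-dimensional activations and the patch labels; it is not the authors' implementation, and it trains only the replacement head rather than the full network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: in the paper, X would be the 4096-d activations taken
# before the last fully connected layer, and y the label that each image
# patch inherits from its patch sequence.
N, D, C = 64, 4096, 2
X = rng.normal(size=(N, D))
y = rng.integers(0, C, size=N)

# New two-class softmax layer replacing the 1,000-class ImageNet head.
W = rng.normal(scale=0.01, size=(D, C))
b = np.zeros(C)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, labels):
    return -np.log(p[np.arange(len(labels)), labels]).mean()

lr = 1e-4  # the small fine-tuning learning rate from the text
loss0 = cross_entropy(softmax(X @ W + b), y)
for _ in range(200):
    p = softmax(X @ W + b)       # forward pass
    g = p.copy()
    g[np.arange(N), y] -= 1.0    # dL/dlogits for cross entropy
    g /= N
    W -= lr * (X.T @ g)          # gradient descent update
    b -= lr * g.sum(axis=0)

loss = cross_entropy(softmax(X @ W + b), y)
```

After training, the cross-entropy loss on these placeholder patches is lower than before training, mirroring the fine-tuning objective described above.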
Bidirectional Long Short-Term Memory Recurrent Neural Network
A Recurrent Neural Network (RNN) is a layered neural network that uses its cyclic connections to learn temporal dependencies in sequential data. The structure of a simple RNN is shown in Figure 5. Compared to feed-forward layered neural networks, for instance the multilayer perceptron [22], which can only learn static pattern mappings, an RNN can propagate information from prior time steps forward to the current time step, learning the context information in a sequence of feature vectors. In other words, the hidden layer of an RNN serves as a memory function.
Figure 5. An example of a recurrent neural network.
An RNN can be described mathematically as follows. Suppose there is a sequence of feature vectors denoted as x_t, t ∈ [1, T]. In the RNN, the hidden layer output vector h_t and the output layer output y_t are calculated as follows:
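As a minimal sketch of the per-step computation of h_t and y_t, the following assumes the common simple-RNN formulation, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h) and y_t = softmax(W_hy h_t + b_y); the exact activation functions used in the paper may differ, and the weights here are random placeholders rather than learned parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

T, D, H, C = 5, 4096, 128, 2   # sequence length, feature dim, hidden units, classes
xs = rng.normal(size=(T, D))   # stand-in for the 1 x 4096 feature vector sequence

# Simple-RNN parameters (random here; learned by backpropagation in practice).
W_xh = rng.normal(scale=0.01, size=(D, H))   # input -> hidden
W_hh = rng.normal(scale=0.01, size=(H, H))   # hidden -> hidden (the cyclic connection)
W_hy = rng.normal(scale=0.01, size=(H, C))   # hidden -> output
b_h, b_y = np.zeros(H), np.zeros(C)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(H)   # h_0; the hidden state acts as the network's memory
ys = []
for t in range(T):
    # h_t mixes the current input x_t with the previous state h_{t-1},
    # which is how prior-time information propagates forward.
    h = np.tanh(xs[t] @ W_xh + h @ W_hh + b_h)
    ys.append(softmax(h @ W_hy + b_y))
```

Each y_t is a probability distribution over the two classes, and because h_t feeds into h_{t+1}, every output depends on the entire prefix of the feature vector sequence.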
§ UAV, unmanned aerial vehicle, commonly known as a drone, is an aircraft without a human pilot aboard.
Figure 4. The architecture of the VGG-16 CNN model [26].
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING, July 2018, page 462