Received: 20 November 2019 | Revised: 27 April 2020 | Accepted: 14 June 2020
DOI: 10.1002/rob.21975
REGULAR ARTICLE
The effect of data augmentation and network simplification on the image‐based detection of broccoli heads with Mask R‐CNN
Pieter M. Blok¹,² | Frits K. van Evert¹ | Antonius P. M. Tielen³ | Eldert J. van Henten² | Gert Kootstra²
¹ Agrosystems Research, Wageningen University & Research, Wageningen, The Netherlands
² Farm Technology Group, Wageningen University & Research, Wageningen, The Netherlands
³ Greenhouse Horticulture, Wageningen University & Research, Wageningen, The Netherlands
Correspondence
Pieter M. Blok, Agrosystems Research,
Wageningen University & Research,
Droevendaalsesteeg 1, 6708 PB Wageningen,
The Netherlands.
Email: pieter.blok@wur.nl
Funding information
Tony Wisdom (Skagit Valley Farm)
Abstract
In current practice, broccoli heads are selectively harvested by hand. The goal of our
work is to develop a robot that can selectively harvest broccoli heads, thereby
reducing labor costs. An essential element of such a robot is an image‐processing
algorithm that can detect broccoli heads. In this study, we developed a deep learning
algorithm for this purpose, using the Mask Region‐based Convolutional Neural
Network. To be applied on a robot, the algorithm must detect broccoli heads from
any cultivar, meaning that it must generalize across broccoli images. We hypothesized
that generalization can be improved through network simplification and data
augmentation. We found that network simplification decreased the generalization
performance, whereas data augmentation increased it. In data augmentation, the
geometric transformations (rotation, cropping, and scaling) led to better
generalization than the photometric transformations (light, color, and texture).
Furthermore, the algorithm generalized to a
broccoli cultivar when 5% of the training images were images of that cultivar. Our
algorithm detected 229 of the 232 harvestable broccoli heads from three cultivars.
We also tested our algorithm on an online broccoli data set, which our algorithm
was not previously trained on. On this data set, our algorithm detected 175 of the
176 harvestable broccoli heads, showing that the algorithm generalized
successfully. Finally, we performed a cost‐benefit analysis for a robot equipped with our
algorithm. We concluded that the robot was more profitable than the human
harvest and that our algorithm provided a sufficient basis for robot commercialization.
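The distinction between the two augmentation families named in the abstract can be illustrated with a minimal sketch. It is not the augmentation pipeline used in this study: the function names, the 90-degree rotation steps, the 80% crop size, and the brightness and color ranges are all illustrative assumptions. The point it shows is that a geometric transformation moves pixels, so the instance mask must receive the same transformation, whereas a photometric transformation changes only pixel values and leaves the mask untouched.

```python
import numpy as np


def geometric_augment(image, mask, rng):
    """Rotate and crop the image and its mask identically (hypothetical helper)."""
    # Rotation in 90-degree steps (assumption: arbitrary angles need interpolation).
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    # Random crop to 80% of each side (assumed crop fraction).
    h, w = mask.shape
    ch, cw = int(0.8 * h), int(0.8 * w)
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return image[top:top + ch, left:left + cw], mask[top:top + ch, left:left + cw]


def photometric_augment(image, mask, rng):
    """Change brightness and color of the image only (hypothetical helper)."""
    gain = rng.uniform(0.7, 1.3)                   # assumed brightness range
    shift = rng.uniform(-20, 20, size=(1, 1, 3))   # assumed per-channel color shift
    out = image.astype(np.float32) * gain + shift
    # Pixel positions are unchanged, so the mask is returned as-is.
    return np.clip(out, 0, 255).astype(np.uint8), mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(100, 80, 3), dtype=np.uint8)  # synthetic RGB image
    mask = np.zeros((100, 80), dtype=bool)
    mask[20:60, 10:50] = True                                        # synthetic "broccoli head"
    g_image, g_mask = geometric_augment(image, mask, rng)
    p_image, p_mask = photometric_augment(image, mask, rng)
    print(g_image.shape, g_mask.shape, p_image.shape, p_mask.shape)
```

Scaling and texture changes, which the abstract also lists, are omitted here to keep the sketch dependency-free.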
KEYWORDS
agriculture, computer vision, learning, perception, sensors
1 | INTRODUCTION
In agriculture, numerous tasks depend on human labor. This labor is
getting more expensive and more scarce, which causes problems for
tasks that are done by hand, such as the selective harvest of crops.
Selective hand‐harvest involves the visual assessment of the crop,
followed by the harvest of only those specimens that have reached
the desired size, quality, or maturity. One crop that is selectively harvested
by hand is broccoli (Brassica oleracea var. italica). In the
Netherlands, broccoli is usually hand‐harvested three times in one
growing season (Kwin, 2018). Cost studies show that the hand‐
harvest of broccoli can take up to 107 man‐hours per hectare and
J Field Robotics. 2020;1–20. wileyonlinelibrary.com/journal/rob © 2020 Wiley Periodicals LLC | 1