Tomato Flower Detection and Counting in
Greenhouses Using Faster Region-Based
Convolutional Neural Network
Umme Fawzia Rahim and Hiroshi Mineno
Graduate School of Integrated Science and Technology, Shizuoka University, Japan
Email: fawzia@minelab.jp, mineno@inf.shizuoka.ac.jp
Abstract—To optimize fruit production and improve
profitability, cultivators remove excess flowers and fruitlets
from plants and trees in the early growing season. The
proportion of the flowers to be removed is determined by
the flower intensity, i.e., the total number of flowers present
in a row in the greenhouse. Several automated computer
vision methods have been presented to estimate flower
intensity, but their overall performance is still far from
satisfactory. With the aim of designing a method for flower
detection which is robust to occlusions and to changes in
lighting conditions and camera position, this study presents
a technique in which a pre-trained Faster Region-based
Convolutional Neural Network (Faster R-CNN) is
fine-tuned, followed by a color-based thresholding process to
detect and count tomato flowers in greenhouses.
Experimental results on a dataset composed of greenhouse
tomato flower images acquired under different conditions
demonstrate high performance, with precision
and recall of 96.02% and 93.09%, respectively. The flower
count from the proposed technique is comparable with the
number counted manually, with an error of -4 to +3 flowers
per image.
Index Terms—agricultural engineering, computer vision,
deep learning, faster R-CNN, flower detection and counting
I. INTRODUCTION
Flower intensity has a major effect on fruit yield and
fruit quality [1], [2]. Along with other factors such as
climate, flower intensity is especially critical for guiding
thinning, which is the process of removing excess flowers
and fruitlets in the early growing season. Proper thinning
increases fruit market value, since it affects fruit size,
color, skin performance, firmness, soluble solids, sugar
and acid content.
Manuscript received May 16, 2020; revised October 12, 2020.
Although flower intensity estimation is important for
crop production, there has been relatively limited
progress so far in automating flower counting.
Currently, this activity is typically performed manually.
However, manual counting is tedious, labor-intensive,
and prone to errors and uncertainties. Machine vision
systems using different types of image sensors and image
processing techniques can make counting more efficient
and minimize labor cost. Flowers
generally differ markedly in color and texture from the
background. Several studies used traditional image
processing methods such as color and shape analysis to
segment flower pixels [3]-[5]. Flower intensity was
calculated using morphological operations on the
segmented flower pixels [5] or by exploring the correlation
between flower count and flower pixel percentage [3], [4].
However, the applicability of those methods is hindered,
especially by changes in illumination, background clutter,
and occlusion by leaves, stems, or other flowers. In addition,
most existing methods estimate flower numbers from the flower
pixel percentage instead of counting individual flowers.
Such techniques require parameter adjustment
whenever flower density (high/low) or
camera position (distance and angle) changes.
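The sensitivity of pixel-percentage methods to illumination can be illustrated with a minimal sketch. The RGB thresholds, helper names, and toy image below are hypothetical and are not taken from the cited studies [3]-[5]; they only show how a fixed color rule behaves when brightness changes.

```python
# Illustrative sketch only: thresholds and helper names are hypothetical,
# not taken from the cited studies [3]-[5].

def is_flower_pixel(rgb, min_rg=150, max_b=120):
    """Crude 'yellow' test: strong red and green, weak blue."""
    r, g, b = rgb
    return r >= min_rg and g >= min_rg and b <= max_b


def flower_pixel_percentage(image, **thresholds):
    """Percentage of pixels in `image` (rows of RGB tuples) passing the test."""
    pixels = [px for row in image for px in row]
    if not pixels:
        return 0.0
    hits = sum(is_flower_pixel(px, **thresholds) for px in pixels)
    return 100.0 * hits / len(pixels)


def scale_brightness(image, factor):
    """Simulate an illumination change by scaling every channel."""
    return [[tuple(min(255, int(c * factor)) for c in px) for px in row]
            for row in image]


# Toy 2x4 image: three yellow 'flower' pixels and five green 'leaf' pixels.
YELLOW, GREEN = (230, 210, 40), (40, 120, 50)
image = [[YELLOW, YELLOW, GREEN, GREEN],
         [YELLOW, GREEN, GREEN, GREEN]]

bright = flower_pixel_percentage(image)                        # 37.5
dim = flower_pixel_percentage(scale_brightness(image, 0.6))    # 0.0
```

With the thresholds held fixed, the same scene at 60% brightness yields a flower pixel percentage of zero, so any count derived from that percentage would need its parameters retuned for the new lighting.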
Inspired by successful studies using deep
Convolutional Neural Networks (CNNs) in challenging
computer vision and object detection tasks, we propose a
robust method to detect and count tomato flowers under
varying greenhouse conditions using a state-of-the-art
object detector called Faster Region-based Convolutional
Neural Network (Faster R-CNN) [6]. In our approach, a
pre-trained Faster R-CNN is adopted through transfer
learning and is further tuned to become particularly
sensitive to tomato flowers. Finally, thresholding
according to color and size features is applied to each
identified flower region to eliminate misclassifications
and very small, faraway flowers that are not yet of interest.
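The post-detection thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the box format, the `is_yellow` rule, and the `min_area` and `min_yellow` values are all hypothetical, and the detector output is replaced by a hand-written list of boxes.

```python
# Illustrative sketch only: all names and threshold values are hypothetical
# and are not the settings used in this study.

def is_yellow(rgb, min_rg=150, max_b=120):
    """Crude color test for a yellow flower pixel."""
    r, g, b = rgb
    return r >= min_rg and g >= min_rg and b <= max_b


def yellow_fraction(image, box):
    """Fraction of yellow pixels inside box = (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = box
    pixels = [image[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not pixels:
        return 0.0
    return sum(is_yellow(px) for px in pixels) / len(pixels)


def filter_detections(image, boxes, min_area=4, min_yellow=0.5):
    """Keep boxes that are large enough and sufficiently yellow."""
    kept = []
    for box in boxes:
        x1, y1, x2, y2 = box
        if (x2 - x1) * (y2 - y1) < min_area:
            continue  # size threshold: drop very small (faraway) flowers
        if yellow_fraction(image, box) < min_yellow:
            continue  # color threshold: drop likely misclassifications
        kept.append(box)
    return kept


# Toy 4x6 image: a 2x2 yellow flower, one isolated yellow pixel, green leaves.
Y, G = (230, 210, 40), (40, 120, 50)
image = [[Y, Y, G, G, G, G],
         [Y, Y, G, G, G, Y],
         [G, G, G, G, G, G],
         [G, G, G, G, G, G]]

# Stand-in for detector output: a true flower, a tiny flower, a leaf region.
boxes = [(0, 0, 2, 2), (5, 1, 6, 2), (2, 2, 5, 4)]
kept = filter_detections(image, boxes)
flower_count = len(kept)
```

Here the tiny box fails the size test and the leaf region fails the color test, so only the first box is counted. Because each box is evaluated on its own, the count comes from individual detections rather than from a global pixel percentage.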
II. RELATED WORK
Many computer vision methods for automatic
identification of flowers in images have been proposed. In
a study aimed at estimating flowering in an apple orchard,
the researchers used simple color thresholding to
segment the white apple flowers from the background [7].
The images were acquired at night using artificial lighting,
so lighting conditions were constant and favorable for
detection. However, when images are captured during the day,
varying lighting conditions become a challenge. In a study on
estimating the flower intensity of lesquerella, the images
were transformed to the HSI color space to perform the
segmentation [4]. The model estimated flower counts
with root mean squared errors that ranged from 159 to
194 flowers. Although the researchers used a Monte Carlo
approach to minimize uncertainty in the HSI parameters used
Journal of Image and Graphics, Vol. 8, No. 4, December 2020
©2020 Journal of Image and Graphics 107
doi: 10.18178/joig.8.4.107-113