Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

International Journal of Engineering & Technology, 7 (4.11) (2018) 49-54
Website: www.sciencepubco.com/index.php/IJET
Research paper

A Practical Plant Diagnosis System for Field Leaf Images and Feature Visualization

E. Fujita 1 *, H. Uga 2, S. Kagiwada 3, H. Iyatomi 1
1 Applied Informatics, Graduate School of Science and Engineering, Hosei University, Japan
2 Saitama Agricultural Technology Research Center, Japan
3 Clinical Plant Science, Faculty of Bioscience and Applied Chemistry, Hosei University, Japan
*Corresponding author E-mail: iyatomi@hosei.ac.jp

Abstract

An accurate, fast and low-cost automated plant diagnosis system has been called for. While several studies utilizing machine learning techniques have been conducted, significant issues remain: in most cases the dataset is not composed of field images and often includes a substantial number of inappropriate labels. In this paper, we propose a practical automated plant diagnosis system. We first build a highly reliable dataset by cultivating plants in a strictly controlled setting. We then develop a robust classifier capable of analyzing a wide variety of field images. We use a total of 9,000 original cucumber field leaf images to identify seven typical viral diseases, Downy mildew and healthy plants, including initial symptoms. We also visualize the key regions of diagnostic evidence. Our system attains 93.6% average accuracy, and we confirm that our system captures important features for the diagnosis of Downy mildew.

Keywords: convolutional neural networks; feature visualization; image processing; plant diagnosis.

1. Introduction

Plant diseases affect agricultural production all over the world [1-3].
To minimize the damage and avoid secondary infection, we have to identify the infected plants and apply an appropriate treatment (e.g., removal of infected plants or pesticide application) as soon as possible. Plant diagnosis is generally conducted through visual examination by experts, with subsequent genetic testing applied as necessary, so it is usually expensive and time-consuming.

In such circumstances, the agricultural industry has requested methodologies for automated plant diagnosis characterized by accuracy, speed and low cost. Several studies have been carried out in response to such requests [4-23]. The authors of [4] used support vector machines (SVM) to classify rice plant diseases and attained 92.7% accuracy. The authors of [5] analysed leaf and stem images of plants with an artificial neural network classifier, which achieved around 93% accuracy in classifying them into six classes (five diseases and a healthy state). The authors of [7] also used an artificial neural network classifier and reported 87.8% in fungal disease diagnosis. The authors of [12] discriminated cassava diseases in five categories (four diseases and a healthy state) and estimated their severity in five grades from healthy (= 1) to terminal (= 5). They used a combination of their original feature descriptors and classifiers such as linear SVM, and claimed 99.98% and nearly 99% accuracy in disease severity estimation and classification, respectively. The authors of [18] investigated six kinds of Cercospora leaf spots of sugar cane with an evaluation of common statistical and hand-crafted image features. Their method attained 82% accuracy.

These methods achieved good performance on their own target tasks. However, since they are designed based on conventional pattern recognition techniques, i.e.
a sequential process of (1) preprocessing, including segmentation, detection of the regions of interest (ROI), etc., (2) development of hand-crafted features specially designed for a specific task, and (3) classification, they usually have constraints on their usage.

In recent years, a new machine learning scheme called deep learning has demonstrated many promising achievements in a wide range of industries. Convolutional neural networks (CNNs) are a principal deep learning technique, applied to machine learning tasks including computer vision. CNNs automatically capture efficient image features for classification from the training images as a part of their learning process. As a result, they have not only significantly reduced the need for the complicated hand-crafted processes mentioned previously but also achieved high classification performance.

Recently, several applications for automated plant diagnosis relying on deep learning have also been proposed [11, 15, 17, 20-23]. The authors of [15] used a total of 54,306 plant leaf images from PlantVillage [24], covering 14 crop species and 26 diseases for a total of 38 classes of crop-disease pairs, and built CNN classifiers. Their best score reached an overall accuracy of 99.35%. However, all the leaves used in their study were physically cropped, and each leaf was placed separately in front of a uniformly colored background and photographed. These conditions are quite different from what we observe in the field, so a noticeable difference in performance is seen in practical situations. In fact, they also noted in their manuscript that the accuracy dropped to around 31% in a setting different from that of the training images. In addition, we found a significant number of inappropriate label assignments in the PlantVillage dataset. This is a serious problem inherent to open datasets. Note that the PlantVillage dataset is not currently available to the public.
The authors of [22] analysed apple leaves to classify four kinds of diseases with CNNs and attained an excellent average accuracy of 97.62%. However, as with the PlantVillage dataset, their study also used cropped leaf images, and therefore these systems cannot be directly applied to practical situations.
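
To make the conventional three-stage process described earlier in this section concrete (preprocessing, hand-crafted features, classification), the following is a minimal sketch in Python. It is not the method of any cited study: the near-white background threshold, the mean-colour feature, the nearest-centroid classifier, the synthetic images and all function names are assumptions chosen only for illustration.

```python
"""Toy sketch of a conventional three-stage diagnosis pipeline.

Everything here (names, the background threshold, the synthetic images) is a
hypothetical illustration, not the method of any cited study.
"""
from math import dist
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]           # (R, G, B), each 0-255
Image = List[List[Pixel]]              # rows of pixels
Features = Tuple[float, float, float]  # mean R, G, B scaled to 0-1


def segment_leaf(image: Image, background_min: int = 200) -> List[Pixel]:
    """Stage 1 (preprocessing): drop near-white pixels, assumed to be background."""
    return [p for row in image for p in row if min(p) < background_min]


def extract_features(pixels: List[Pixel]) -> Features:
    """Stage 2 (hand-crafted feature): mean colour of the leaf pixels."""
    if not pixels:
        raise ValueError("segmentation found no leaf pixels")
    n = len(pixels)
    r, g, b = (sum(p[i] for p in pixels) / (255 * n) for i in range(3))
    return (r, g, b)


def fit_centroids(samples: List[Tuple[Image, str]]) -> Dict[str, Features]:
    """Stage 3a: average the feature vectors of each class."""
    grouped: Dict[str, List[Features]] = {}
    for image, label in samples:
        grouped.setdefault(label, []).append(extract_features(segment_leaf(image)))
    centroids: Dict[str, Features] = {}
    for label, feats in grouped.items():
        r, g, b = (sum(f[i] for f in feats) / len(feats) for i in range(3))
        centroids[label] = (r, g, b)
    return centroids


def classify(image: Image, centroids: Dict[str, Features]) -> str:
    """Stage 3b: pick the class whose centroid is closest in feature space."""
    feats = extract_features(segment_leaf(image))
    return min(centroids, key=lambda label: dist(feats, centroids[label]))


def make_leaf(colour: Pixel, size: int = 4) -> Image:
    """Synthetic image: a uniformly coloured square inside a white border."""
    white = (255, 255, 255)
    inner = range(1, size - 1)
    return [[colour if r in inner and c in inner else white for c in range(size)]
            for r in range(size)]


# Hypothetical training set: greenish leaves are healthy, yellowish are diseased.
TRAINING: List[Tuple[Image, str]] = [
    (make_leaf((40, 160, 40)), "healthy"),
    (make_leaf((50, 170, 60)), "healthy"),
    (make_leaf((180, 170, 60)), "diseased"),
    (make_leaf((170, 160, 50)), "diseased"),
]

if __name__ == "__main__":
    model = fit_centroids(TRAINING)
    print(classify(make_leaf((60, 150, 50)), model))
    print(classify(make_leaf((175, 165, 70)), model))
```

The segmentation rule in this sketch only works when the leaf sits on a near-white background. That is the kind of built-in constraint that limits such pipelines, and it mirrors the gap between images taken on a uniform background and images taken in the field.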