1057-7149 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2018.2801119, IEEE Transactions on Image Processing

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. X, NO. X, JANUARY 2018

Hookworm Detection in Wireless Capsule Endoscopy Images with Deep Learning

Jun-Yan He, Xiao Wu*, Member, IEEE, Yu-Gang Jiang, Member, IEEE, Qiang Peng, and Ramesh Jain, Life Fellow, IEEE

Abstract—As one of the most common human helminths, hookworm is a leading cause of maternal and child morbidity, which seriously threatens human health. Recently, wireless capsule endoscopy (WCE) has been applied to automatic hookworm detection. Unfortunately, it remains a challenging task. In recent years, deep convolutional neural networks (CNNs) have demonstrated impressive performance in various image and video analysis tasks. In this paper, a novel deep hookworm detection framework (DHDF) is proposed for WCE images, which simultaneously models visual appearances and tubular patterns of hookworms. This is the first deep learning framework specifically designed for hookworm detection in WCE images. Two CNNs, namely an edge extraction network and a hookworm classification network, are seamlessly integrated in the proposed framework, which avoids edge feature caching and speeds up classification. Two edge pooling layers are introduced to integrate the tubular regions induced from the edge extraction network and the feature maps from the hookworm classification network, leading to enhanced feature maps emphasizing the tubular regions.
Experiments have been conducted on one of the largest WCE datasets with 440K WCE images, which demonstrate the effectiveness of the proposed hookworm detection framework. It significantly outperforms the state-of-the-art approaches. The high sensitivity and accuracy of the proposed method in detecting hookworms show its potential for clinical application.

Index Terms—Hookworm detection, deep learning, convolutional neural network, computer-aided detection, wireless capsule endoscopy.

I. INTRODUCTION

HOOKWORM is an infection caused by a parasitic bloodsucking roundworm. It is a leading cause of maternal and child morbidity in developing countries of the tropics and subtropics due to poor sanitation. Hookworm infection seriously threatens human health and can impair the physical and intellectual development of children. It is reported that hookworm has affected more than 600 million people worldwide [1].

As a miniature medical device for gastrointestinal (GI) diagnosis, a Wireless Capsule Endoscopy (WCE) capsule [2] travels through the digestive system to collect images or physiological data after being swallowed by the patient. It takes two or more color images of the GI tract per second over a few hours to capture the whole GI tract, producing around 50,000 images in total. Identifying suspicious areas and analyzing potential diseases is a laborious and tedious process for trained endoscopists, who usually need a couple of hours to examine these images manually. To assist the endoscopists, a series of automatic lesion detection solutions have been proposed recently. The process is illustrated in Fig. 1.

This work was supported in part by the National Natural Science Foundation of China (61772436, 61373121, and 61272290), and Sichuan Science and Technology Innovation Seedling Fund (2017RZ0015, 2017018). Asterisk indicates corresponding author.
Jun-Yan He, Xiao Wu and Qiang Peng are with the School of Information Science and Technology, Xipu Campus, Southwest Jiaotong University, Chengdu 611756, China (e-mail: junyanhe1989@gmail.com; wuxiaohk@swjtu.edu.cn; qpeng@swjtu.edu.cn).
Yu-Gang Jiang is with the School of Computer Science, Fudan University, Shanghai (e-mail: ygj@fudan.edu.cn).
Ramesh Jain is with the School of Information and Computer Science, University of California, Irvine, USA (e-mail: jain@ics.uci.edu).

[Figure 1 panel labels: Wireless Capsule Endoscopy (WCE); Image Sequence of Wireless Capsule Endoscopy; Automatic Hookworm Detection]
Fig. 1. Wireless Capsule Endoscopy (WCE) takes two or more color images per second to capture the whole gastrointestinal (GI) tract after being swallowed by the patient. These WCE images are then analyzed by automatic abnormality detection software.

It is reported that over one million patients globally have been examined with WCE, which has been widely used in recent years for several inflammatory bowel diseases and disorders, such as bleeding [3], [4], polyps [5]–[7], ulcers [5], tumors [8], Crohn's disease [9], and so on. Unfortunately, automatic hookworm detection [10], [11] in WCE images has not been fully explored.

Automatic hookworm detection in WCE images remains a challenging task. The quality of WCE images is usually poor due to hardware limitations and lighting conditions. The image resolution is only 256 × 256 pixels. The free motion of the capsule and the contractions of the gut produce various orientations and perspectives of the scene. Different parts of the intestinal tract (stomach, duodenum, jejunum-ileum, and cecum) have complex structures, presenting various appearances with multiple colors and textures. Diverse extraneous matter mixed in the GI tract, such as food, stool, bile and bubbles, seriously influences the detection. Moreover, hookworms demonstrate different shapes, widths and bend orientations. These challenges pose a great difficulty for automatic hookworm detection.
Due to the superior ability of learning mid-level and high-level abstractions obtained from raw data, deep learning, in