Received April 30, 2021, accepted May 27, 2021, date of publication June 3, 2021, date of current version June 11, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3086102

Cr-Prom: A Convolutional Neural Network-Based Model for the Prediction of Rice Promoters

MUHAMMAD SHUJAAT 1,2, SEUNG BEOP LEE 3 (Member, IEEE), HILAL TAYARA 3, AND KIL TO CHONG 1,4 (Member, IEEE)

1 Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
2 Department of Computer Sciences, Bahria University, Lahore 54600, Pakistan
3 School of International Engineering and Science, Global Frontier College, Jeonbuk National University, Jeonju 54896, South Korea
4 Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea

Corresponding authors: Hilal Tayara (hilaltayara@jbnu.ac.kr) and Kil To Chong (kitchong@jbnu.ac.kr)

This work was supported in part by the National Research Foundation of Korea (NRF) Grant through the Korean Government [Ministry of Science and ICT (MSIT)] under Grant 2020R1A2C2005612, and in part by the Brain Research Program of the National Research Foundation (NRF) through the Korean Government (MSIT) under Grant NRF-2017M3C7A1044816.

ABSTRACT The promoter is a regulatory region of DNA, typically located upstream of a gene, that plays a key role in regulating gene transcription. Accurate prediction of promoters is crucial for the analysis of gene expression patterns and for the development and understanding of genetic regulatory networks. Genomes of several species have been sequenced, and their gene content has been established to a large extent. Several bioinformatics algorithms have been developed to predict promoters across plant species in general; however, few studies have focused on identifying promoters in rice specifically, which may limit practical applications. Here, we present a rice promoter prediction tool, Cr-Prom.
This predictor has been established using a series of sequence-based features and datasets extracted from the PlantProm and RAP-DB databases. We applied a convolutional neural network (CNN)-based strategy to construct a predictor with robust classification performance. To evaluate its performance, we ran experiments on a benchmark dataset using 5-fold cross-validation and compared our results with existing techniques using four figures of merit. In addition, Cr-Prom was evaluated on an independent dataset. Based on the results, Cr-Prom outperformed the existing rice-specific promoter predictors. The Cr-Prom tool can be freely accessed at: http://nsclbio.jbnu.ac.kr/tools/Cr-Prom/

INDEX TERMS Promoters, convolutional neural network (CNN), computational biology, bioinformatics, rice genome.

The associate editor coordinating the review of this manuscript and approving it for publication was Shadi Alawneh.

I. INTRODUCTION

Rice is a cereal crop belonging to the genus Oryza; cultivated rice can be divided into two subspecies, namely indica and japonica. Rice can also be classified as conventional or hybrid, depending on its production type. As a vital cash crop, rice is the staple food for the majority of the world's population. From basic studies to molecular breeding, researchers have played a significant role in boosting rice production worldwide. Owing to the rapid development of biotechnology and genetic engineering technology, scientists began analyzing and collating the rice genome in 1998, and by 2002 the entire rice genome map had been interpreted. The rice genome is the most completely sequenced genome among higher organisms [1]. Among the 37,500 genes identified, several play a significant role in agricultural production. For example, research on key genes can help increase the yield of rice [2] or change its photoperiod.
Transcription of protein-coding genes and most non-coding RNA genes, as well as of DNA regions with uncertain functions, is performed in the nuclear genomes of eukaryotic organisms by RNA polymerase II (Pol II). Transcription controls cellular differentiation and function by initiating expression at specific genomic locations. Changes in gene regulation are the main driving factors behind most of the diversification between species [3], [4] and the phenotypic diversity within the same species [5]–[7]. Gene regulation has been the subject of a large number of genetic, biological, chemical, and computational studies.

VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/