KNN classification of Kannada Characters using Hu’s Seven Variants and Zernike Moment Features

Duddela Sai Prashanth 1, Research Scholar, SCSVMV University, dsaip13@gmail.com
Bharath Bhushan 3, Sahyadri College, Mangalore, sn.bharath@gmail.com
C N Panini 2, Research Scholar, SCSVMV University, cnpanini@kanchiuniv.ac.in

Abstract— Text identification is a promising field of research in the domain of computer vision and pattern recognition. This paper deals with the identification of Kannada text. The first step is to eliminate noise and extract the text from the scanned or captured image. The second, essential step is to segment the lines and characters. Noise removal and text extraction can be done using any noise filter and foreground subtraction algorithm. Otsu's algorithm helps to accomplish the task of foreground extraction. Horizontal and vertical profiling is a method of extracting lines and words from the document image. Extracting knowledge from the dataset using Hu’s seven invariant moments and Zernike moment features helps to overcome many problems. During training, a knowledge base is generated using the above-mentioned methods. A KNN classifier is then used to recognize unknown characters by searching this knowledge base with the features computed from them.

Keywords—Computer Vision; Character Identification; OCR Techniques

I. INTRODUCTION
India is a multilingual country with 22 official languages and more than 1600 languages in existence. Kannada is one of the official languages and is widely used in the state of Karnataka. Identification of text written by humans is a promising research area owing to its wide range of applications. The complexity begins with extracting the text from the scanned or captured image.
Segmentation of the text from the image involves two steps, horizontal profiling and vertical profiling, which separate the lines and words within the image. Preprocessing steps are the default in any image processing technique, to remove noise and to separate foreground from background. This research identifies handwritten Kannada text by extracting features of the written text and building a knowledge database from them. Character-level segmentation is done after noise is removed with a salt-and-pepper noise filter. For foreground and background separation, Otsu's algorithm, a traditional and efficient method adopted by many researchers, is used. On the foreground, horizontal profiling is used for line segmentation and vertical profiling for word or character segmentation. Hu’s seven invariant moments and Zernike moments are used to extract the features and to build the knowledge database for the input images.

II. DATA COLLECTION AND PRE-PROCESSING

A. Data Collection
The proposed method can only be evaluated with an appropriate dataset. Because no standard dataset for the Kannada language is available, we created our own dataset for this research. Samples were collected from 20 college students in the age group of 21 to 25 years. All the letters of the Kannada alphabet were written on paper and scanned. All training and testing is done with this dataset. The natural variation in handwriting helps to generate a robust knowledge base for testing.

Fig. 1 Sample Data Set

B. Pre-Processing
For better performance, noise should be removed from the image. Morphological operations are used for noise removal. To extract the foreground from the background, Otsu's algorithm is used, which converts the image into binary format.
Black pixels (value 0) and white pixels (value 1) are thereby separated, and all the text is extracted from the scanned image.
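
The binarization and profiling steps described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it is written in plain Python to stay dependency-free, it assumes dark ink on a light background, and the function names (`otsu_threshold`, `binarize`, `horizontal_profile`, `vertical_profile`, `segment_runs`) and the toy 6x6 image are hypothetical and chosen only for illustration. A practical pipeline would more likely rely on an image processing library.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    `pixels` is a flat list of integer gray levels in [0, levels).
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class so far
    sum0 = 0  # sum of gray levels of the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map dark pixels (<= threshold) to 0 and bright pixels to 1."""
    return [[0 if p <= threshold else 1 for p in row] for row in image]


def horizontal_profile(binary):
    """Count text (0-valued) pixels in each row."""
    return [sum(1 for p in row if p == 0) for row in binary]


def vertical_profile(binary):
    """Count text (0-valued) pixels in each column."""
    return [sum(1 for row in binary if row[c] == 0)
            for c in range(len(binary[0]))]


def segment_runs(profile):
    """Return (start, end) pairs of consecutive non-empty entries, end exclusive."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Toy grayscale image (assumed values): light background, two dark text lines.
image = [
    [200, 210, 205, 200, 208, 202],
    [200,  30,  35, 200,  40, 202],
    [205,  32, 210, 200,  38, 201],
    [200, 210, 205, 200, 208, 202],
    [203,  28,  30,  33, 209, 200],
    [200, 210, 205, 200, 208, 202],
]

t = otsu_threshold([p for row in image for p in row])
binary = binarize(image, t)

# Horizontal profiling: rows containing ink are grouped into text lines.
lines = segment_runs(horizontal_profile(binary))

# Vertical profiling inside the first line separates its characters.
first_start, first_end = lines[0]
chars = segment_runs(vertical_profile(binary[first_start:first_end]))
```

In this sketch, empty rows in the horizontal profile mark the gaps between text lines, and empty columns in the vertical profile of a single line mark the gaps between characters. Touching or overlapping characters would need additional handling that is not shown here.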