International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 11 | Nov 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1870

Survey on Grammar Checking and Correction using Deep Learning for Indian Languages

Neethu S Kumar 1, Supriya L P 2
1 MTech, Dept. of Computer Science & Engineering, Sree Buddha College of Engineering, Pathanamthitta, Kerala
2 Assistant Professor, Dept. of Computer Science & Engineering, Sree Buddha College of Engineering, Pathanamthitta, Kerala

Abstract - A grammar checker is one of the basic Natural Language Processing tools for any language. Grammar checkers are widely used to detect and correct errors in sentences during the writing process. There are different kinds of grammar checkers. This paper surveys grammar checkers that use deep learning for Indian languages. Grammar checking is a fundamental task in the writing process. A grammar consists of many rules, including those for past, present, and future tense. Different grammar checkers exist for different languages, each aiming to improve accuracy and minimize errors. The survey concludes with the features of existing grammar checkers.

Key Words: Natural Language Processing, Grammar, Grammar checker, Rule-based, Statistical, Hybrid

1. INTRODUCTION

Language is a means of communication between human beings. Human natural language can be described as a process of exchange between human beings. Grammar is an element of language and consists of a set of rules. Words are the basic grammatical units, and they combine to form sentences according to these grammar rules.
Many grammatical errors occur during the writing process. One of the main objectives of communication is to share information, which can be conveyed in written or spoken form. In either form, what matters most is the validity of the sentences in the language. Morphemes, phonemes, words, phrases, clauses, sentences, vocabulary, and grammar are the building blocks of language. All valid sentences of a language must follow the rules of that language. A sentence is a combination of words. Language learners from different backgrounds write sentences with various types of errors.

Sentences can be classified into three main types. First, simple sentences, which consist of one or more arguments; such a sentence contains a clause, usually with a verb root, and does not contain question words or negation. Second, complex sentences, which contain two clauses, with interdependence between the main clause and the dependent (subordinate) clause. Third, compound sentences, which contain multiple clauses.

Natural Language Processing is a subfield of artificial intelligence concerned with the interaction between computers and human languages. Much of natural language processing has been based on handwritten rules. Grammar checking is one of the most common technologies in natural language processing, and many grammar checkers are in use for different languages. A grammar checker is a program that checks whether a sentence is grammatically correct. Grammar checkers are based on different approaches: rule-based checking, statistics-based checking, and hybrid checking. Most existing grammar checkers perform style checking and check for uncommon words and complicated sentence structures.

1.1 Statistical Grammar Checker

A statistical grammar checker uses an annotated corpus. The annotated corpus is compiled from different journals, magazines, or documents.
It ensures the correctness of sentences by checking the input sentences against the corpus. There are two main ways to check an input sentence. In the first, the input text is compared directly with the corpus: a sentence that matches the corpus is accepted as correct, and one that does not is tagged as grammatically erroneous. In the second, rules are generated from the corpus and the input sentence is checked using these rules. When the corpus is updated or new data is added, the rules are not updated. A disadvantage of this approach is that it is difficult to locate the error in a sentence and to recognize what kind of error the system has found.

1.2 Rule Based Grammar Checking

The most commonly used approach is rule-based grammar checking. In rule-based grammar checking, the input sentence is checked against a set of rules. Unlike the statistical approach, in which the rules are derived from the corpus, here the rules are written manually. The rules are easy to configure and to modify. A significant advantage of this approach is that the rules can be maintained by someone without programming knowledge, and the checker can provide detailed error messages. The main characteristics of this approach are that the rules must cover all features of the language, that sentences need to be complete to be checked, and that input sentences can be handled easily.
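
The contrast between Sections 1.1 and 1.2 can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed systems. The toy corpus, the bigram matching strategy, the two rules, and the function names `statistical_check` and `rule_based_check` are all assumptions chosen for illustration, and English is used only to keep the example readable.

```python
import re
from collections import Counter

# Toy "annotated corpus": sentences assumed to be grammatical (illustrative only).
CORPUS = [
    "she writes a letter",
    "he writes a book",
    "they write a letter",
    "she reads a book",
]

def bigrams(tokens):
    """Return adjacent word pairs, padded with start and end markers."""
    padded = ["<s>"] + tokens + ["</s>"]
    return list(zip(padded, padded[1:]))

# Statistical side: count every bigram observed in the corpus.
BIGRAM_COUNTS = Counter(bg for s in CORPUS for bg in bigrams(s.split()))

def statistical_check(sentence):
    """Return the bigrams of the sentence that never occur in the corpus."""
    tokens = sentence.lower().split()
    return [bg for bg in bigrams(tokens) if BIGRAM_COUNTS[bg] == 0]

# Rule-based side: hand-written (pattern, message) pairs (hypothetical rules).
RULES = [
    (re.compile(r"\b(he|she|it) (write|read)\b"),
     "Third-person singular subject needs a verb ending in -s."),
    (re.compile(r"\b(\w+) \1\b"),
     "Repeated word."),
]

def rule_based_check(sentence):
    """Return (matched text, message) for every rule that fires."""
    text = sentence.lower()
    errors = []
    for pattern, message in RULES:
        for match in pattern.finditer(text):
            errors.append((match.group(0), message))
    return errors

if __name__ == "__main__":
    for s in ["she writes a letter", "she write a letter"]:
        print(s)
        print("  unseen bigrams:", statistical_check(s))
        print("  rule violations:", rule_based_check(s))
```

In this sketch the statistical check can only report that a word pair was never seen in the corpus, which reflects the difficulty of explaining errors noted in Section 1.1. The rule-based check attaches a readable message to each match, and a new rule can be added by appending one entry to `RULES`.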