© 2015, IJARCSSE All Rights Reserved

International Journal of Advanced Research in Computer Science and Software Engineering
Volume 5, Issue 7, July 2015, ISSN: 2277 128X
Research Paper
Available online at: www.ijarcsse.com

A Naïve Bayes Approach for Word Sense Disambiguation

Gurinder Pal Singh Gosal
Department of Computer Science, Punjabi University, Patiala, Punjab, India

Abstract- Word sense disambiguation (WSD) is the task of automatically selecting the correct sense of a word given its context, and it helps resolve many of the ambiguity problems inherent in all natural languages. Statistical Natural Language Processing (NLP), which is based on probabilistic, stochastic and statistical methods, has been used to solve many NLP problems. The Naïve Bayes algorithm, a supervised learning technique, has worked well in many classification problems. In the present work, the Naïve Bayes machine learning technique is applied to disambiguate the senses of different words from the standard corpora available in the “1998 SENSEVAL Word Sense Disambiguation (WSD) shared task”. It is observed that the senses of an ambiguous word with fewer parts of speech are disambiguated more accurately. Another key observation is that the fewer the senses to be disambiguated, the higher the chance that a word is assigned its correct sense.

Keywords- Word sense disambiguation, WSD, POS-filtering, ambiguity, Naïve Bayes, supervised learning

I. INTRODUCTION

Ambiguity in word senses exists inherently in all natural languages used by humans. Every language has many words that carry more than one meaning. For example, the word “chair” has one sense meaning a piece of furniture and another meaning a person who chairs, say, a session. Some context is therefore needed to select the correct sense in a given situation.
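As an illustration of how context can select between the two senses of “chair”, the following is a minimal Python sketch of a Naïve Bayes sense classifier. It is not the implementation used in this work: the four training sentences, the sense labels and the function names `train` and `disambiguate` are hypothetical, and add-one (Laplace) smoothing and the skipping of unseen context words are assumptions made only to keep the example small.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labeled contexts for the ambiguous word "chair".
TRAINING = [
    ("furniture", "she sat on the wooden chair near the table"),
    ("furniture", "the chair has a broken leg and a soft cushion"),
    ("person", "the chair opened the session and invited the speaker"),
    ("person", "the committee elected a new chair for the session"),
]


def train(examples):
    """Count how often each sense occurs and which context words accompany it."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for sense, sentence in examples:
        words = sentence.split()
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocabulary.update(words)
    return sense_counts, word_counts, vocabulary


def disambiguate(context, sense_counts, word_counts, vocabulary):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocabulary)
        for word in context.split():
            if word in vocabulary:  # assumption: ignore words unseen in training
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAINING)
print(disambiguate("he pulled a chair up to the table", *model))
print(disambiguate("the chair closed the session", *model))
```

With this toy data, context words such as “table” favour the furniture sense and words such as “session” favour the person sense. A real system would be trained on a sense-annotated corpus rather than on a handful of invented sentences.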
Automatically selecting the correct sense given a context is at the core of solving many ambiguity problems. Word sense disambiguation (WSD) is the task of automatically determining which sense of an ambiguous (target) word is intended in a specific use of the word, by taking into consideration the context of that use [1,2]. Accurate and reliable word sense disambiguation has long been a goal of the natural language community. The motivation for performing word sense disambiguation is the belief that many tasks carried out under the umbrella of NLP benefit greatly from properly disambiguated word senses.

Statistical NLP, an approach to NLP based on probabilistic, stochastic and statistical methods, uses machine learning algorithms to solve many NLP problems. As a branch of artificial intelligence, machine learning involves computationally learning patterns from given data and applying the learned patterns to new or unseen data. Tom M. Mitchell defines machine learning as follows: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E” [3].

Learning algorithms can generally be classified into three types: supervised learning, semi-supervised learning and unsupervised learning. Supervised learning is based on studying the features of positive and negative examples over a large annotated corpus. Semi-supervised learning uses both labeled and unlabeled data in the learning process to reduce the dependence on labeled training data. In unsupervised learning, decisions are made on the basis of unlabeled data. Unsupervised methods are mostly built upon clustering techniques, similarity-based functions and distribution statistics. For automatic WSD, supervised learning is one of the most successful approaches.

II. RELATED WORK

The problem of WSD drew the interest of researchers as soon as work began on processing languages by automatic means. The WSD task is therefore one of the oldest tasks in computational linguistics. Weaver introduced the problem of WSD to the community in 1949, presenting it as a basic task of Machine Translation (MT). In his well-known Memorandum on Machine Translation, he stressed that the problem of multiple word senses can be dealt with by looking at the context in which the word occurs [4]. This research highlighted the importance of the immediate context, or adjacent words, in disambiguating senses. Weaver also analyzed the role of the domain in the WSD task, and much of the work that followed in this direction produced specialized dictionaries [5, 6] for sense disambiguation. For a long time, the research community held the view that machine translation and word sense disambiguation are tasks that have to be dealt with independently. WSD was thought to be a very difficult task to achieve given the limited resources available at that time. In another study, Reifler discussed the role of syntactic relations in WSD, stressing the role of grammatical structure [7].