Enhancing the Performance of Semantic Search in Bengali using Neural Net and other Classification Techniques

Arijit Das¹, Diganta Saha¹
¹Department of Computer Science and Engineering, Faculty of Engineering and Technology, Jadavpur University, Kolkata, West Bengal, 700032 INDIA
Corresponding author: Arijit Das (e-mail: arijit.das@ieee.org).

Abstract
Search has long been an important tool for users to retrieve information. Syntactic search matches documents or objects containing specific keywords, and uses signals such as user history, location, and preference to improve the results. However, the query and the best answer often have no terms, or very few terms, in common, and syntactic search cannot perform properly in such cases. Semantic search resolves these issues but, in low-resource languages, suffers from a lack of annotation and the absence of a WordNet. In this work, we demonstrate an end-to-end procedure to improve the performance of semantic search using semi-supervised and unsupervised learning algorithms. An available Bengali repository was chosen, with seven types of semantic properties, primarily to develop the system. Performance has been tested using Support Vector Machine, Naive Bayes, Decision Tree, and Artificial Neural Network (ANN) classifiers. Over the course of learning, our system gained the ability to predict the correct semantics using its knowledge base. A repository containing around a million sentences, a product of the TDIL project of the Govt. of India, was used to test our system in the first instance. Testing was then carried out for other languages. Being a cognitive system, it may be very useful for improving user satisfaction in e-Governance or m-Governance in a multilingual environment, and also for other applications.

Keywords: Semantic Search, Deep Learning, SVM, Naive Bayes, Neural Network, Decision Tree

I. INTRODUCTION

Semantic search has been around for quite some time and has gained widespread use due to its applications and promising results. Most developing countries are multilingual, and the emerging economies of the world also use more than one official language for communication. India, the largest multilingual democracy, has 22 languages that are officially recognized in the constitution and that the government encourages and promotes. India also has 122 languages that are each spoken by more than ten thousand people and are defined as major languages. Besides these, 1599 other languages exist in India, each used by a very small portion of the population. Seventy percent of India's population is rural, and the majority of these people are proficient only in their mother tongue. They prefer to use their native language on the internet; put another way, they use the internet more for e-Governance applications if the content is available in their mother tongue.

Search is one of the operations that internet users perform most frequently. Consider some case studies in which “semantic search” or “contextual meaning” matters across different language domains. Suppose a Bengali person (a person from West Bengal, India, or Bangladesh whose mother tongue is Bengali) needs to reset his watch, so he wants to find the accurate time on the web and searches for “কটা বাজে?” /katā bāje?/ (“What is the time now?”). As of 06.07.2019 at 15:43, Google, Bing, and Yahoo all fail to give the answer: they either show a blank result or return pages that contain the term “কটা বাজে?” /katā bāje?/. But the searcher, who (let us assume) does not know English, wants to know the time, so the search results should include the local time, GMT, etc. The search engine needs to understand the meaning or context of the searcher's query.
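The failure described in this case study can be made concrete with a minimal Python sketch of purely syntactic matching. It is an illustration only and does not reproduce the ranking of any search engine named above; the function names and the sample answer-page text are hypothetical, and Jaccard term overlap is assumed here as a simple stand-in for keyword matching.

```python
def tokenize(text: str) -> set:
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    punct = "?!.,;:()\"'“”"
    tokens = (tok.strip(punct).lower() for tok in text.split())
    return {tok for tok in tokens if tok}


def keyword_overlap(query: str, document: str) -> float:
    """Jaccard overlap between the term sets of a query and a document."""
    q, d = tokenize(query), tokenize(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


# Hypothetical answer page; the text is made up for illustration.
answer_page = "Current local time and Greenwich Mean Time (GMT)"

bengali_query = "কটা বাজে?"
english_query = "What is the time now?"

score_bn = keyword_overlap(bengali_query, answer_page)
score_en = keyword_overlap(english_query, answer_page)

# The Bengali query shares no terms with the page, so its score is 0.0,
# while the English query matches on the single term "time".
print(score_bn, score_en)
```

Under these assumptions the Bengali query can never retrieve the page by term matching alone, although it asks exactly the same question as the English one.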
A smart search engine should map such queries to the same answer as the query “what is the time now?”, but search engines fail to understand the meaning of the query and therefore cannot retrieve the current local time or Greenwich Mean Time.

Citizens’ feedback is one of the most important pillars of good governance. E-governance makes giving feedback easy and affordable. Giving input in one's native language is easy nowadays, with soft keyboards available in native languages. But if a question is asked in the Nepali language and the answer is present in the Portuguese language, the system fails to retrieve the