Journal of Survey in Fisheries Sciences 10(3) 735-752, 2023

Analysis of Clusters With Indian Patent Data Using Different Word Embedding Techniques

Received: 4 Sep, 2023 Revised: 12 Nov, 2023 Accepted: 1 Dec, 2023

Pankaj Beldar 1*, Mohansingh Pardeshi 2, Rahul Rakhade 3, Shilpa Mene 4

1* K. K. Wagh Institute of Engineering Education and Research, Email: prbeldar@kkwagh.edu.in
2 K. K. Wagh Institute of Engineering Education and Research, Email: mrpardeshi@kkwagh.edu.in
3 K. K. Wagh Institute of Engineering Education and Research, Email: rdrakhade@kkwagh.edu.in
4 K. K. Wagh Institute of Engineering Education & Research, Nashik, Email: spmene@kkwagh.edu.in

*Corresponding Author: Pankaj Beldar, K. K. Wagh Institute of Engineering Education and Research, prbeldar@kkwagh.edu.in

Abstract. This study employs advanced Unsupervised Machine Learning (UML) techniques, including K-means and Agglomerative clustering, to analyze descriptive Indian Patent data. Utilizing silhouette score evaluation, elbow method, and dendrogram analysis, optimal cluster numbers are determined. Various word embedding methods like TF-IDF, Word2Vec, and CountVectorizer, combined with rigorous text processing, are explored. Robust testing of categorical and numerical features yields a high silhouette score of 0.8965 for 2 clusters, showcasing Agglomerative clustering's effectiveness. The research emphasizes the crucial role of UML techniques, word embedding methodologies, and comprehensive text processing in revealing complex structures within Indian Patent data. Besides advancing unsupervised learning methodologies, this work aids scholars, practitioners, and policymakers in comprehending the Indian patent landscape, fostering innovation, and technological progress.
Keywords: K-means, Agglomerative clustering, Word embedding, Patents, Silhouette Score

1 Introduction

In a time of invention and rapid technological development, intellectual property rights (IPR) and patents are essential for protecting creative work, promoting development, and driving global economic expansion. Among the world's economies, India stands out as a country where the significance and effect of intellectual property rights, especially patents, have changed dramatically. In the socioeconomic environment of India, the value of intellectual property rights, which include patents, copyrights, trademarks, and trade secrets, cannot be overstated. In recent decades, India has placed a growing focus on innovation in a variety of fields, which has raised awareness of the need to protect and utilize intellectual property.

The objective of this paper is to examine the various aspects of intellectual property rights (IPR) and patents in Indian society, clarifying their significance, ramifications, and the changing dynamics within the nation's socio-economic structure. To understand the effects and outcomes surrounding intellectual property rights in India, this study looks at the interactions between IPR laws, innovation, entrepreneurship, and national development. It examines the legislative framework, policy interventions, and their effects on innovation and economic growth as it traces the historical development of India's intellectual property landscape. This article also includes case studies and empirical data to highlight the concrete effects of IPR and patents on a variety of industries, including the Indian creative industries, technology, pharmaceuticals, and agriculture.

The use of natural language processing (NLP) makes it easier to extract important data from policy frameworks, patent databases, and legal documents.
This helps identify new technologies, innovation patterns, and how Indian IPR laws are changing. To better understand the implications and interpretations of IPR laws and patent applications in the Indian socio-economic context, researchers can apply NLP-powered algorithms to perform sentiment analysis, entity recognition, summarization, and categorization of legal texts. Furthermore, with word embedding methods such as Word2Vec, GloVe, or BERT, words and phrases can be represented as high-dimensional vectors that capture the contextual meanings and semantic relationships found in the textual corpus. With these methods, scholars can investigate the semantic similarities, clusters, and connections found in legal and patent texts, enabling a more sophisticated examination of the terminology found in Indian IPR frameworks, court rulings, and patent