IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 25, Issue 3, Ser. I (May. – June. 2023), PP 01-12
www.iosrjournals.org
DOI: 10.9790/0661-2503010112

A Study of Performance Evaluation and Comparison of NOSQL Databases Choosing for Big Data: HBase and Cassandra Using YCSB

Nasrullah 1, Nijad Ahmad 2, Dr. Neeta Sharma 3, Mohammad Saber Niazy 4
1 Dean (Faculty of Computer Science at Khurasan University, Nangarhar, Jalalabad, Afghanistan)
2 Vice Chancellor (Khurasan University, Afghanistan)
3 Associate Prof. (Dept. of CSE, Noida International University, Greater Noida, U.P.)
4 Assistant Prof. (Faculty of Computer Science at Khurasan University, Nangarhar, Jalalabad, Afghanistan)

Abstract: For years, data has been a critical part of technology, and its volume has grown along with technology and the population. Much of this data is non-relational (NoSQL) and unstructured. Over time, it has become increasingly difficult for traditional database management systems to manage such enormous databases in a virtual cloud environment. Cloud providers now offer services such as NoSQL databases to manage this non-relational data while addressing requirements such as availability, reliability, performance, safety, and security. Hence there is a need to evaluate and compare the performance and behavior of the NoSQL databases HBase and Cassandra using YCSB. Big data comes from many different sources and is generated and processed at the TB, PB, and ZB levels; across this range, big data processing platforms apply different rules, policies, guidelines, tools, and techniques at each level. Several revolutions in data management systems have occurred recently, such as Big Data analytics, data visualization, and NoSQL databases.
Although these technologies were developed independently and for different purposes, they complement each other. Their convergence would benefit businesses tremendously in making real-time decisions using volumes of complex data sets that may be structured, semi-structured, or unstructured. Several software-as-a-service solutions have emerged to support Big Data analytics, and many NoSQL database packages have arrived on the market now that cloud providers offer anything as a service. However, these solutions lack independent benchmarking and comparative evaluation. The aim of this paper is to provide an understanding of HBase and Cassandra performance in the context of Big Data analytics, and to present an in-depth study that uses YCSB to compare the features of these two main NoSQL data models.

Keywords: Big Data, Hadoop, NoSQL Databases, Cassandra, HBase, YCSB Application.

Date of Submission: 28-04-2023
Date of Acceptance: 07-05-2023

I. INTRODUCTION
Data grows in complexity as its volume rises over time. 2.5 billion bytes of data are generated every day, and handling this huge amount of data is the biggest concern for every organization. A large amount of data is generated by sources such as Facebook, WhatsApp, and YouTube, which are cornerstones of the internet. This exponential, day-to-day growth of data is what the term big data represents. Big data supports many present-day use cases in data-driven, virtual cloud environments, and it must be managed with respect to velocity, volume, variety, and value.
The traditional way of managing databases, using relational database management systems, cannot handle this huge amount of data because of its volume and the processing power it requires; such systems can store only schema-based internal data in a limited number of predefined formats. The Big Data paradigm is gradually changing present techniques for storing, processing, administering, and analyzing data in a virtual cloud environment [1]. This has led to new developments in the design and deployment of database architectures to handle big data, although data processing and visualization rates remain very low. NoSQL data does not have a fixed, predefined structure or schema, and it is enormous in volume. NoSQL databases allow the storage of multiple forms of data, which may be structured, semi-structured, or unstructured [2]. They store data in the form of column families, key-value pairs, and document stores. Hence NoSQL databases are designed to replace the traditional SQL DBMS. Since these databases are non-relational, query language support varies from system to system. Different types of NoSQL database management systems are used in present-day applications, as they do not depend entirely on queries for processing, storing, and managing data [3][4][5]. These databases are used extensively in environments where the data does not rely on a relational model or a fixed structure. There are different NoSQL