International Journal of Computer Applications Technology and Research Volume 4– Issue 7, 529 - 534, 2015, ISSN:- 2319–8656 www.ijcat.com 529 A Review of Data Access Optimization Techniques in a Distributed Database Management System Sadiq Mobolaji Abubakar Department of Computer Science University of Port Harcourt Nigeria Anyama Oscar Uzoma Department of Computer Science University of Port Harcourt Nigeria Adamade Peter Simon Department of Computer Science University of Port Harcourt Nigeria Abstract: In today's computing world, accessing and managing data has become one of the most significant elements. Applications as varied as weather satellite feedback to military operation details employ huge databases that store graphics images, texts and other forms of data. The main concern in maintaining this information is to access them in an efficient manner. Database optimization techniques have been derived to address this issue that may otherwise limit the performance of a database to an extent of vulnerability. We therefore discuss the aspects of performance optimization related to data access in distributed databases. We further looked at the effect of these optimization techniques. Keywords: Data; Distributed database; Performance; TSQLANN 1. INTRODUCTION It’s a known fact that the amount of data that enterprises are storing and managing is growing rapidly. Industry estimates indicate that data volume is doubling every 2-3 years. The rapid growth of data presents frightening challenges for IT, both in cost and for our study, performance. Although the cost of storage keeps declining, fast-growing data volumes make storage one of the costliest elements of most IT budgets. In addition, the accelerating growth of data makes it difficult to meet performance requirements while staying within budget. When a database based application performs slowly, there is a 90% probability that, the data access routines of that application are not written in the best possible way or optimized, In this paper we will discuss Data access performance optimization in transactional SQL Server databases and will also consider the performance of a very large database with and without our suggested optimization. Though the optimization techniques are suggested for transactional SQL Server databases but most of the techniques are roughly the same for other database platforms. In oracle 12c, Automatic Data Optimization (ADO) automatically moves and compresses data according to user-defined policies based on the information collected by Heat Map, [1]. 1.1 Performance Optimization Techniques It is worth mentioning here that all forms of optimization actually enhances the performance of the database and below are some of the techniques employed. Indexing in the table column: We need to create primary key in every table of the database. When we create a primary key in a table, a clustered index tree is created and all data pages containing the table rows are physically sorted in the file system according to their primary key values. Each data page contains rows which are also sorted within the data page according to their primary key values. [2], pointed that each time any row from the table is asked for; the database server finds the corresponding data page first using the clustered index tree and then finds the desired row within the data page that contains the primary key value. The intermediate nodes contain range of values and direct the SQL engine where to go while searching for a specific index value in the tree starting from the root node. The leaf nodes are the nodes which contain the actual index values. If this is a clustered index tree, the leaf nodes are the physical data pages. If this is a non-clustered index tree, the leaf nodes contain index values along with clustered index keys (Which the database engine uses to find the corresponding row in the clustered index tree). Usually, finding a desired value in the index tree and jumping to the actual row from there takes an extremely small amount of time for the database engine. So, indexing generally improves the data retrieval operations which a performance enhancement strategy, [3]. Movement from application into the database server of SQL Codes: Moving the SQLs from application and implementing these using stored procedures/ Views/ Functions/ Triggers will enable us to eliminate any duplicate SQLs in our application. This will also ensure reusability of our TSQL codes. Implementing all TSQLs using the database objects will enable us to analyze the TSQLs more easily to find possible inefficient codes that are responsible for slow performance. Also, this will let us manage our TSQL codes from a central point, [4]. Doing this will also enable us to re-factor our TSQL codes to take advantage of some advanced indexing techniques. Also, this will help us to write more “Set based” SQLs along with eliminating any “Procedural” SQLs that we might have already written in our application. Despite the fact that indexing will let us troubleshoot the performance problems in our application in a quick time, following this step might not give us a real performance boost instantly. But, this will mainly enable us to perform other subsequent optimization steps and apply different other techniques easily to further optimize our data access routines. 1.2 Covering Index: If we know that our application will be performing the same query over and over on the same table, we should consider creating a covering index on the table. A covering index,