International Journal of Computer Applications (0975 – 8887) Volume 106 – No.7, November 2014

A Cluster based Probabilistic Model for Link Prediction to Improve User Interface over Internet

Vivek Rawat
Research scholar
SIRT Bhopal

Sumit Vashishtha
Assistant professor
SIRT Bhopal

ABSTRACT
The rapid growth of web applications has increased researchers' interest in them. The world is now surrounded by computer networks. A web application is an application that is accessed through a web browser over a network, and it is widely used for communication and data transfer. Web caching is a well-known strategy for improving the performance of Web-based systems. Performance is improved by keeping the Web objects that are likely to be used in the near future in a location closer to the user. Web caching mechanisms are implemented at three levels: (i) the client level, (ii) the proxy level and (iii) the original server level. Proxy servers play a vital role between users and web sites by reducing the response time of user requests and saving network bandwidth. Thus, to achieve a better response time, an efficient caching approach must be implemented in the proxy server. This paper combines the weighted rule mining concept, cluster based link prediction and a Markov model for fast and frequent web pre-fetching.

Keywords
Web Services, Pre-fetching, Log file, cluster

1. INTRODUCTION
The Web is a key resource for sharing information around the world. It carries a large amount of news and advertisements, provides global connectivity between people, and offers a great deal of knowledge for students. This massive use of the Web, or WWW, makes it an important subject of research. Researchers face the challenge of making web applications more efficient.
Many researchers have worked on this problem, each proposing new ideas to improve on earlier results, and this work contributes to that effort [1, 12, 13]. There is a pressing need to improve the server response time of web applications. The current Web has become a massive repository because of the sudden increase in its use, and attention must be paid to both the quality and the quantity of web content. Even though Internet speed has improved and costs have fallen, traffic is growing much heavier. The enormous amount of information makes it difficult to find the relevant information quickly. This has led to efforts to improve speed by reducing latency, which makes the web more relevant and more meaningfully connected [2, 8, 9].

Cache pre-fetching plays an important role in improving the response time and making the application well organized. Web pre-fetching is a technique that processes user requests in advance, before they are actually made. The time the user must wait for the requested documents can therefore be reduced by hiding the request latencies. In short, pre-fetching is a method for reducing latency. The user always expects an interactive response, better satisfaction and good quality of output. Various approaches and algorithms have been proposed for improving web performance [3, 10, 11].

The proposed work predicts the forthcoming link in order to improve the user experience and speed up users' browsing. Predictive Web pre-fetching, or link prediction, refers to the method of deducing the upcoming page accesses of a client from its past accesses. In this work we demonstrate frequent pattern mining on the input data, and the caching and pre-fetching ratios are calculated from the mined patterns. Thus we present a new idea for the interpretation of Web pre-fetching and web caching from the given usage items.
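
To illustrate how upcoming page accesses can be deduced from past accesses, the following is a minimal sketch of a first-order Markov predictor. It is a simplified illustration and not the full cluster based model proposed in this paper. The session data, the function names `build_transitions` and `predict_next`, and the `top_k` parameter are assumptions introduced only for this example.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-page transitions over all user sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each page with the page requested immediately after it.
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return counts


def predict_next(counts, page, top_k=1):
    """Return up to top_k (next_page, probability) pairs, most likely first."""
    followers = counts.get(page)
    if not followers:
        return []  # page never seen as a predecessor: nothing to pre-fetch
    total = sum(followers.values())
    # Sort by descending count, then by page name for a stable order.
    ranked = sorted(followers.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(nxt, n / total) for nxt, n in ranked[:top_k]]


# Hypothetical sessions, as they might be extracted from a server log file.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/reviews"],
    ["/home", "/about"],
    ["/products", "/cart"],
]

model = build_transitions(sessions)
prediction = predict_next(model, "/home", top_k=2)
print(prediction)
```

In this sketch, a proxy could pre-fetch the highest ranked pages returned by `predict_next` while the user is still reading the current page. A first-order model looks only at the current page; using longer histories or per-cluster models is a design choice this example does not cover.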
The approach is based on web mining combined with a clustering approach.

This paper is divided into seven sections. The first section is the introduction, which gives a brief description of the work. The second section discusses previous work related to the topic. The third section describes the approach used in the presented work. The fourth section describes the proposed architecture. The simulation results are then discussed. Finally, the last section concludes the paper.

2. PREVIOUS WORK
Research on web mining and web pre-fetching has advanced rapidly in recent decades. Toufiq Hossain Kazi et al. [4] present a pre-fetching technique based on Adaptive Resonance Theory (ART), namely ART1, which uses the bottom-up and top-down weights of the cluster-URL connections obtained from a modified ART1 algorithm to make pre-fetching decisions.

A. B. M. Rezbaul Islam et al. [6] proposed a new and improved FP-tree with a table, together with a new algorithm for mining association rules. This algorithm mines all possible frequent item sets without generating the conditional FP-tree. It also provides the frequency of the frequent items, which is used to estimate the desired association rules.

P. Sampath et al. [5] present a weight estimation process that uses span time, request count and access sequence details. The page weight, which is based on user interest, is used to extract the frequent item sets. A systolic tree is used to arrange candidate sets with their frequency values. Because the size of the systolic tree is limited, a transactional database must be projected into smaller ones, each of which can be mined efficiently in hardware. A high performance projection algorithm that fully exploits the advantages of FP-growth is proposed and implemented. It reduces the mining time by partitioning the tree into dense and sparse parts and sending the dense tree to the hardware. The systolic tree based rule mining scheme is enhanced for the weighted rule mining process.
An automatic weight estimation scheme is used in the system.

With the explosively growing amount of Web content, including digitized manuals, emails, pictures, multimedia and Web services, a distinct and elaborate structural framework is required that can provide a navigational surrogate for clients as well as for servers. Due to the increasing amount of data available