Ranking WebPages Using Web Structure Mining Concepts

Zakaria Suliman Zubi
Computer Science Department, Faculty of Science
Sirte University, Sirte, Libya
Email: zszubi@yahoo.com

Abstract: - With the rapid growth of the Web, users easily get lost in its rich hyper structure. Providing relevant information that meets users' needs is the primary goal of website owners. Web mining is one of the techniques that can help website owners in this direction. Web mining is divided into three categories: web content mining, web usage mining, and web structure mining. Web structure mining plays an important role in this approach. Two page ranking algorithms, PageRank and Hyperlink-Induced Topic Search (HITS), are commonly used in web structure mining. Both algorithms treat all links equally when distributing rank scores. This paper also compares the two algorithms. Ranking web pages is an important task because it helps the user find highly ranked pages that are relevant to the query. Different metrics have been proposed to rank web pages according to their quality, and this paper briefly discusses the two most prominent ones.

Key-Words: - Web Mining, Web Content Mining, Web Usage Mining, Web Structure Mining, HITS, PageRank, Authority and Hubs.

1 Introduction

The Web is a rich source of information and continues to grow in size and complexity. Retrieving the required web page efficiently and effectively is becoming a challenge nowadays [1]. Whenever a user searches for relevant pages, the user prefers those pages to be readily at hand. A relevant web page is one that addresses the same topic as the original page but is not semantically identical to it [1].
In fact, the Web is an unstructured data warehouse that delivers a massive amount of information and increases the complexity of handling information from the differing perspectives of knowledge seekers, business analysts, and web service providers [2]. Moreover, Google reported in 2008 that there were 1 trillion unique URLs on the Web [3]. The Web has grown enormously and is used very heavily, so it is essential to understand its data structure. The sheer amount of information makes it very hard for users to find, extract, filter, or evaluate the relevant information. This issue draws attention to the need for techniques that can address these challenges. Web mining can be readily applied to this problem with the help of other areas such as databases (DB), information retrieval (IR), natural language processing (NLP), and machine learning. These techniques can be used to discover and analyze useful information from the Web. In dealing with these aspects, several challenges should be taken into account [3]:

1) The Web is huge.
2) Web pages are semi-structured.
3) Web information tends to be diverse in meaning.
4) The degree of quality of the extracted information.
5) Deriving knowledge from the extracted information.

The paper is organized as follows. The categories of Web mining are discussed in Section 2. Section 3 explains the importance of web page ranking and two important algorithms: the Hyperlink-Induced Topic Search (HITS) algorithm and the PageRank algorithm. Section 4 compares the two web page ranking algorithms. Concluding remarks are given in Section 5.

2 Web Mining Categories

Web mining consists of three main categories, according to the web data used as input: (1) Web Content Mining; (2) Web Usage
Recent Advances in Telecommunications, Signals and Systems ISBN: 978-1-61804-169-2 21