International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056 | p-ISSN: 2395-0072 | Volume: 09 Issue: 04 | Apr 2022 | www.irjet.net
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3610

Calculating Rank of Web Documents Using Its Content and Link Analysis

Amit Kumar 1, Anshita Bhardwaj 2, Anshika Jain 3, Mr. Jagbeer Singh 4
1,2,3,4 - Department of Computer Science and Engineering, Meerut Institute of Engineering and Technology, Meerut-250005, Uttar Pradesh, India

Abstract - On the World Wide Web (WWW), when a user submits a query to a search engine, ranking is the means by which the search engine measures the importance of web pages. Today, a great deal of vital information is available online in the form of text documents. Various search engines are available for mining this information according to the user query and returning the most relevant results to the user. Search engines retrieve and display documents according to their ranking, and many of them use page ranking to assign weights to a website's pages. In this paper, content-based matching is combined with page ranking based on hyperlink evaluation to display more accurate and relevant results for the user query.

Key Words: Hyperlink evaluation, Ranking, Search engine, Search query, Content-based.

1. INTRODUCTION

Nowadays, the PageRank method is widely used in bibliometrics [7], information networks, social analysis, and link prediction. It is also used for the systems analysis of road networks and in the sciences, including neuroscience. Its key property is that, however long the query is, the results are always returned as an ordered list of links. PageRank seems very simple.
But when a simple calculation is applied thousands or millions of times, the results can seem complicated. The main purpose of this paper is to provide an effective way to obtain query results, using very simple code for clarity and understanding. As future work, the method could be optimized by creating content that the target audience wants to see, which is likely to attract links more effectively than other approaches.

A search query is a string of words that a user enters in the search box; the search engine typically responds in under a second. A search engine is an online application that takes a query from the user and, based on the keywords or catchphrases received from the user, fetches results by crawling [8] websites with the help of crawlers or spiders. It then sorts the results into a list of hyperlinks corresponding to the matched documents.

In this paper, content-based matching is combined with page ranking based on hyperlink evaluation to display more accurate and relevant results for the user query. First, we fetch the links, together with the content of the top-ranked text documents, and store them in a dictionary. A score is then calculated for every document, and each document is tagged with its score. Finally, the documents are reordered by score, highest first, so that the most relevant pages appear at the top of the search results.

The list of results returned in response to a query is known as the Search Engine Result Page (SERP). The results provided by search engines may consist of a mix of videos, images, articles, web pages, and many other types of files. The ranking of web pages returned in response to a user query combines a measure of the relevance of the page to the query with a query-independent measure of the quality of the page.
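The procedure described above (store links and content in a dictionary, score every document, reorder by score) can be illustrated with a minimal Python sketch. It is not the code used in this paper: the function names, the term-frequency content measure, the damping factor of 0.85, the equal weighting of the two scores, and the sample URLs are all assumptions made for illustration.

```python
import re
from collections import Counter


def content_score(query, text):
    """Query-dependent relevance: fraction of words in `text` that are query terms."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    return sum(counts[t] for t in terms) / len(words)


def pagerank(links, damping=0.85, iterations=50):
    """Query-independent quality: PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are assumed values.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    if not pages:
        return {}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def rank_documents(query, documents, weight=0.5):
    """Combine both scores and return (url, score) pairs, highest score first.

    `documents` maps url -> {"text": str, "links": list of urls}.
    `weight` is a hypothetical mixing parameter between content and link scores.
    """
    # Keep only links that point to documents in the fetched set.
    links = {url: [t for t in doc["links"] if t in documents]
             for url, doc in documents.items()}
    link_scores = pagerank(links)
    scores = {}
    for url, doc in documents.items():
        relevance = content_score(query, doc["text"])
        scores[url] = weight * relevance + (1.0 - weight) * link_scores[url]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: three fetched pages with their text and outgoing links.
documents = {
    "a.example/dosa": {"text": "South Indian food dosa idli recipe",
                       "links": ["b.example/idli"]},
    "b.example/idli": {"text": "Idli is a South Indian food",
                       "links": ["a.example/dosa"]},
    "c.example/news": {"text": "Cricket match report",
                       "links": ["a.example/dosa", "b.example/idli"]},
}

for url, score in rank_documents("South Indian food", documents):
    print(f"{score:.4f}  {url}")
```

In this sketch the content score and the link score both lie between 0 and 1, so a simple weighted sum is enough to combine them. A different weighting or normalization may suit a real collection better.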
The objective of this project is to use both link and content analysis to reduce the number of uncertain or unhelpful web pages that appear at the top of the results.

2. BACKGROUND HISTORY

The web pages that a search engine shows at the top of its results are at times unwanted or useless to the user, as a result of certain practices. Web document retrieval is mainly of three types, which are explained as follows:

2.1 Organic Search

Organic search is the search methodology in which pages are retrieved by the search engine's own algorithm. Pages that score exceptionally well under this algorithm generally do so on factors such as the quality and suitability of the content, and the specialization/expertise, authoritativeness, and trustworthiness of the website and its content writer on the given topic. Organic search results are the unpaid results that appear on the results page after the user submits a query. For example, when a user types "South Indian food" into a search engine such as Google, all the unpaid results that appear are part of the organic search. People tend to view and open the topmost results on the first page of the search results. Each page of search engine results usually contains 10 organic listings [1,2]; however, some results pages may have