International Journal of Future Generation Communication and Networking
Vol. 10, No. 11 (2017), pp. 19-36
http://dx.doi.org/10.14257/ijfgcn.2017.10.11.03
ISSN: 2233-7857 IJFGCN
Copyright © 2017 SERSC Australia

Discovery of Entity Synonym Using Anchor Text and URLs

Mamta Kathuria 1, Anuradha Singh 2, C. K. Nagpal 3 and Neelam Duhan 4

1 Assistant Professor, YMCA University of Science & Technology, Faridabad (India)
2 Student (M.Tech), YMCA University of Science & Technology, Faridabad (India)
3 Professor, YMCA University of Science & Technology, Faridabad (India)
4 Assistant Professor, YMCA University of Science & Technology, Faridabad (India)

1 mamtakathuria@ymcaust.ac.in, 2 anuradhasngh13@gmail.com, 3 nagpalckumar@rediffmail.com, 4 neelam.duhan@gmail.com

Abstract

Web queries have become increasingly specific, seeking results about a particular entity in a particular context of time, place, and so on: for example, a movie show, a particular train, a newspaper of a particular date, or the performance of a particular stock. The references associated with a particular entity are known as entity references. These references vary across the heterogeneous contexts of the web, and a user may not get the required answers to a query because of these varied references, known as entity synonyms. Entity synonyms cannot be handled through lexical resources like WordNet [1]. Every search engine therefore has to create its own mechanism for finding the synonyms of a particular entity in order to answer users' queries properly, a process known as entity resolution. In the recent past, many researchers have tried to devise mechanisms to generate entity synonyms. This paper is an effort in the same direction: it creates a rich set of entity synonyms for a given entity using inbound anchor text and URLs.
Keywords: Entity, Candidate Entity Synonym, Web Query, Inbound Anchor Text, Entity Synonym Extraction

Received (May 15, 2017), Review Result (August 31, 2017), Accepted (September 20, 2017)

1. Introduction

With the growth of the web, users make diverse queries across a variety of domains concerned with daily life. Common users search for products, brands, recipes, weather forecasts, show timings, price quotes for various products, and so on to meet their daily needs. Such entity-related searches can best be answered from the latest product catalogs and associated databases if the references are specific. However, if the references are general and refer to common entities, product catalogs and databases may not be available. When the entities are well known, synonyms can be found using sources like Wikipedia [2] and FreeBase [3]. For instance, the Bhabha Atomic Research Center may be referred to as Bhabha Institute, BARC, Atomic Energy Center, Nuclear Energy Center, etc. For common entities, however, these online resources may not work. The problem with these generic entities is that the same entity is referred to in multiple ways because web pages have different creators. For example, a newspaper like The Hindustan Times may be referred to as The HT, HT, The Hindustan, The Hindustan Times Today, and