Abstract—Recent advances in computing have greatly increased the use of web-based electronic documents. Current copyright protection laws are inadequate for proving ownership of electronic documents and offer little protection against copying and manipulating information from the web. This has opened many avenues for securing information, and the field of information security has advanced significantly. Digital watermarking has developed into a very dynamic area of research and has addressed challenging issues for digital content. Watermarking can be visible (logos or signatures) or invisible (encoding and decoding). Many visible watermarking techniques have been studied for text documents, but there are very few for web-based text. XML files are used to exchange information on the Internet and often contain important data. In this paper, two invisible watermarking techniques using synonyms and acronyms are proposed for XML files to prove intellectual ownership and achieve security. The techniques are analyzed against different attacks, and the capacity that can be embedded in an XML file is also noted. The capacity of the two methods is also compared. The system has been implemented in C#, and all results were obtained from practical tests.

Keywords—Watermarking, Extensible Markup Language (XML), Synonyms, Acronyms, Copyright protection.

I. INTRODUCTION

With advances in telecommunication and computing, the processing of electronic documents on the Internet is growing rapidly every day. This has arguably developed and matured the concepts of e-business, e-commerce and e-learning. Electronic publishing has played a significant role in Internet technologies and web development, but it has also driven significant development in the subject of information security in recent years.
With the great advances in information security, the three basic principles (confidentiality, integrity and availability) have doubled: three more principles (possession, authenticity and utility) have been added to the security model. Unauthorized use of text copied from the Internet has become common practice and has a great effect on the privacy of data. Electronic documents are exposed to various threats such as copying, redistribution, destruction, forgery and tampering. Copyright protection is no longer sufficient for electronic documents, because copying and manipulating information is not difficult.

Nighat Mir is with the Department of Computer Science, College of Engineering, Jeddah, Kingdom of Saudi Arabia; phone: 966-2-6364300-2324; fax: 966-6377447; email: nmir@effatuniversity.edu.sa. Dr. Sayed Afaq Hussain is with the Department of Computer Engineering, Riphah International University, Islamabad, Pakistan; email: drafaqh@gmail.com.

Digital watermarking methods are considered a strong mechanism for identifying the original owner and proving intellectual property. Watermarking is a branch of information security in which additional ownership information, such as a name, logo, ISBN or signature, is added to the content. It can be applied to any digital media, such as audio, video, image or text, to prevent unauthorized use and duplication. Various methods have been studied and applied to multimedia objects, but only a few to text or electronic text that do not alter its integrity. In digital watermarking, a hidden marker is embedded in the data; it is generally unobservable and can be extracted only by a special detector. The main idea of digital watermarking is to exploit the limited sensitivity of human perception, so that the basic characteristics of the content are not changed [1].

With the ever-increasing growth of Internet users all over the world, it is very important to secure web pages.
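To make the embed-and-detect idea concrete, the following is a minimal Python sketch of one common text-watermarking approach, synonym substitution, applied to the text nodes of an XML document. It is an illustration under stated assumptions and not the scheme proposed in this paper: the synonym table, the function names (embed, extract, capacity) and the bit convention (the first word of a pair encodes 0, the second encodes 1) are all hypothetical, and text that follows a child element (tail text) is ignored for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical synonym pairs: the first word encodes bit 0, the second bit 1.
SYNONYM_PAIRS = [("big", "large"), ("quick", "fast"), ("buy", "purchase")]

BIT_OF = {}   # carrier word -> the bit it encodes
PAIR_OF = {}  # carrier word -> its (bit-0 word, bit-1 word) pair
for zero, one in SYNONYM_PAIRS:
    BIT_OF[zero], BIT_OF[one] = "0", "1"
    PAIR_OF[zero] = PAIR_OF[one] = (zero, one)

WORD = re.compile(r"[A-Za-z]+")


def embed(xml_text, bits):
    """Rewrite carrier words in element text so they spell out `bits`."""
    root = ET.fromstring(xml_text)
    remaining = iter(bits)

    def swap(match):
        word = match.group(0)
        pair = PAIR_OF.get(word.lower())
        if pair is None:
            return word              # not a carrier word, leave untouched
        bit = next(remaining, None)
        if bit is None:
            return word              # payload exhausted, leave untouched
        return pair[int(bit)]        # pick the synonym that encodes this bit

    for element in root.iter():      # document order; tail text is ignored
        if element.text:
            element.text = WORD.sub(swap, element.text)
    return ET.tostring(root, encoding="unicode")


def extract(xml_text, n_bits):
    """Read back the first `n_bits` bits encoded by carrier words."""
    root = ET.fromstring(xml_text)
    bits = []
    for element in root.iter():
        for word in WORD.findall(element.text or ""):
            bit = BIT_OF.get(word.lower())
            if bit is not None:
                bits.append(bit)
    return "".join(bits[:n_bits])


def capacity(xml_text):
    """Number of carrier words, i.e. the maximum payload in bits."""
    root = ET.fromstring(xml_text)
    return sum(
        1
        for element in root.iter()
        for word in WORD.findall(element.text or "")
        if word.lower() in BIT_OF
    )


doc = "<order><item>a big table</item><note>quick delivery, buy now</note></order>"
marked = embed(doc, "101")
print(marked)
print(extract(marked, 3))  # prints "101"
print(capacity(doc))       # prints 3
```

In this example the payload 101 turns "big" into "large", leaves "quick" unchanged and turns "buy" into "purchase". The XML structure is untouched, and the number of carrier words in the document bounds how many bits can be hidden.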
Web pages offer a wide bandwidth for information hiding, and many robust techniques can be developed for web page watermarking. Web page watermarking aims to preserve the integrity of web pages, which are a very popular and rich source of information. HTML and XML are the main tools for web development. Even scripting code is ultimately translated by the browser into HTML. XML files are used to exchange information on the Internet and are very sensitive for their owners [2]. Because of this sensitivity, the importance of XML security is growing every day, and different techniques have been developed to protect its integrity. Given the large amount of data published in the form of XML, its protection is becoming an important requirement. A watermarking scheme for XML files should be based on the usability of the data and on the underlying semantics, such as key attributes and functional dependencies [3].

Web Page Watermarking: XML files using Synonyms and Acronyms
Nighat Mir, Sayed Afaq Hussain
World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering, Vol:5, No:1, 2011
International Scholarly and Scientific Research & Innovation 5(1) 2011, ISNI:0000000091950263, Open Science Index: publications.waset.org/1963/pdf

II. RELATED WORK

Qijun Zhao and Hongtao Lu [4] have proposed a scheme for tamper-proof web pages in which watermarks are generated on the basis of the Principal Component Analysis (PCA) technique. These watermarks are then embedded in HTML