© 2018, IJCERT All Rights Reserved
DOI: http://dx.doi.org/10.22362/ijcert/2018/v5/i1/V5I103

International Journal of Computer Engineering in Research Trends
Multidisciplinary, Open Access, Peer-Reviewed and fully refereed
Research Paper, Volume-5, Issue-1, 2018, Regular Edition, E-ISSN: 2349-7084

Investigation of Mining Association Rules on XML Document

P.M. Gavali 1*
1* Computer Science and Engineering, D.K.T.E. Society’s Textile and Engineering Institute, Ichalkranji, India
e-mail: gavalipm87@gmail.com

Available online at: http://www.ijcert.org
Received: 10/01/2018, Revised: 17/01/2018, Accepted: 18/01/2018, Published: 02/02/2018

Abstract: XML is a globally accepted format for sending data over the internet and between applications running on different platforms and architectures. As a result, a huge amount of the data on the internet is in XML, and researchers have been drawn to XML documents as a source of interesting findings and patterns. Many data mining techniques have been applied to XML, including clustering, classification and association rule mining. In this paper, association rule mining on XML documents is studied. The study can be used to identify what work has been done in this field and how it can be extended in the future.

Keywords: XML, Data Mining, Association rules.

1. Introduction

Nowadays XML (eXtensible Markup Language) is a widely used format for exchanging data on the internet because it is portable. Consequently, much of the data available on the internet is in XML, and it has become essential to process XML documents in order to extract hidden and useful information from them. Most of the work in this direction has been carried out by applying various data mining algorithms. In this paper, we concentrate on mining association rules from XML documents.
The rest of the paper is organized as follows. Section 2 discusses the techniques researchers have used to identify association rules on XML. Section 3 explains the work of various researchers in detail. Section 4 gives the conclusions.

2. Broad Categories of Mining Association Rule on XML Document

Researchers have used techniques from different categories to mine association rules. Association rule mining (ARM) on XML documents is broadly divided into four groups, as shown in Figure 1.

Figure 1 Broad Categories of ARM on XML Document

The tree-based approach builds a tree from the XML document to simplify the identification of association rules. Basic tasks can be performed on this tree, such as identifying frequent itemsets and providing an abstraction layer between the original XML document and the querying system. The structure-based method, in contrast, considers the structure of the original XML document and the relationships between its elements to form rules. It can also be used to track changes made to the XML document. Some researchers have used variants of tables to identify transactions in the XML document, which form the basis for identifying frequent items and association rules. Parsers, in turn, are used to check the quality of underlying data across DTDs and XML