Short Paper
Plagiarism Detection Process using Data Mining Techniques
https://doi.org/10.3991/ijes.v5i4.7869
Mahwish Abid, Muhammad Usman, Muhammad Waleed Ashraf
Riphah International University Faisalabad, Pakistan.
mahwish.abid15@gmail.com
Abstract: As technology grows rapidly and computer systems are used far more than in the past, plagiarism is becoming more common day by day. Plagiarism is the wrongful appropriation of someone else's work. Detecting plagiarism manually is difficult, so the process should be automated. Various tools can be used for plagiarism detection; some work on intrinsic plagiarism while others work on extrinsic plagiarism. Data mining is a field that can help detect plagiarism and improve the efficiency of the process. Different data mining techniques can be used to detect plagiarism, including text mining, clustering, bi-grams, tri-grams, and n-grams.
Keywords: Plagiarism, Paraphrasing, Data mining, Text mining, MDR, tri-gram, n-gram, Clustering, Similarity, Intrinsic plagiarism, Extrinsic plagiarism
1 Introduction
In modern times, the growth of the internet and the easy availability of computers around the globe have made it easy to access others' work, which results in plagiarism. Plagiarism is the act of using someone else's work without the author's knowledge or without acknowledging that person [1]. With the advancement of technology, the use of computers is growing rapidly, and they are now used everywhere: in schools, institutes, and industries. Students' assignments are often submitted in electronic form. The electronic form is easy and convenient for teachers and students alike, but it also creates an easy opportunity for plagiarism.
With information spread widely across the globe, it is very easy to copy material from different sources, including the internet, papers, online books, newspapers, etc., and paste it into a single work without acknowledging the sources. These actions lead to a lack of learning among students, so plagiarism needs to be detected in order to improve students' learning [2].
Plagiarism can occur in any field, e.g., novels, program source code, and research papers. Furthermore, it can occur in numerous situations when
http://www.i-jes.org