HBRP Publication, Pages 13-36, 2023. All Rights Reserved.

Towards Reliable Code Plagiarism Detection: A Survey on Software Clone Detection

Sanjay B. Ankali 1, Dr. S G Gollagi 2, Dr. Bahubali M. Akiwate 3
1 Associate Professor, Department of Computer Science & Engineering, KLE College of Engineering and Technology, Chikodi, India
2 Professor, Department of Computer Science & Engineering, KLE College of Engineering and Technology, Chikodi, India
3 Associate Professor, Department of Computer Science & Engineering, KLE College of Engineering and Technology, Chikodi, India
*Corresponding Author E-mail Id: sanjayankali123@gmail.com

ABSTRACT
Despite three decades of substantial study and the development of more than 250 clone detection techniques, no single framework can accurately and reliably identify all four major types of clones. The lack of comprehensive, reliable, and language-neutral code clone detection has a significant negative effect on online learning systems such as Coursera, which cannot assess students' proficiency from the coding projects and assignments they submit to the platform. This survey contributes to building more reliable code plagiarism detection by presenting tools and techniques for finding same-language and cross-language clones, organized by the clone types they detect and the languages they work on. The paper highlights three major issues concerning language agnosticism and accuracy: a) Most of the proposed techniques detect clones only in a specific language such as C, C++, Java, or Python. b) Only 8 proposed works accurately classify all 4 basic clone types. c) 98% of past clone detection work targets regular clones and ignores micro-clones. The summary of the paper can provide direction for building a more reliable code plagiarism detection tool.
Keywords: Software clone detection, Code plagiarism, Clone types, Software Development Life Cycle

INTRODUCTION
The practice of producing functionally comparable code with syntactic alterations is known as software cloning or code cloning. Alternatively, clones can be described as pairs of semantically related code fragments with or without syntactic modification [1]. Researchers use various terms for this phenomenon, such as duplicate code [4,5], similar code [2], and same code [3]. According to these two papers, large legacy systems contain up to 30% and 50% duplicate code, respectively. According to milestone works in the literature such as [6,7,8,9], there are four types of code clones, of which the first three fall under the syntactic category:
Type 1: commonly known as exact clones.
Type 2: also known as renamed clones.
Type 3: known as near-miss clones.
Type 4: functionally similar clones that are implemented differently.
Different editing taxonomies provide the foundation for syntactic clones. A significant problem with earlier clone detection work is that Type 4 clone detection is beyond the capabilities of many outstanding clone detection techniques, like Siamese [12],
Journal of Advancement in Software Engineering and Testing, Volume 6, Issue 1