JOURNAL OF INFORMATION TECHNOLOGY AND ITS UTILIZATION, VOLUME 3, ISSUE 1, JUNE-2020, 9-13
ISSN 2564-802X

RABIN-KARP IMPLEMENTATION IN MEASURING SIMILARITY OF RESEARCH PROPOSAL OF STUDENTS

Herman 1, Lukman Syafie 2, Tasmil 3, Muhammad Resha 4
1,2 Computer Science Dept., Universitas Muslim Indonesia, Indonesia
3 Balai Besar Penelitian dan Pengembangan SDM KOMINFO Makassar, Indonesia
4 STMIK AKBA Makassar, Indonesia
1 herman@umi.ac.id, 2 lukman.syafie@umi.ac.id, 3 tasmil@kominfo.go.id, 4 m.resha@gmail.com

Abstract-- Plagiarism is the use of data, language and writing without crediting the original author or source. Plagiarism occurs most often in the academic environment, where the most frequently plagiarized material is scientific work such as theses. Reminding students is not enough to minimize the practice. A system or application is therefore needed to measure the level of similarity of student thesis proposals. In computer science, the Rabin-Karp algorithm can be used to measure the level of similarity of texts. The Rabin-Karp algorithm is a string matching algorithm that uses a hash function to compare the search string (m) with substrings of the text (n), and it can work on large data sizes. The test results show that the choice of k-gram value affects the measured similarity level. In addition, a k-gram value of 5 executed faster than values of 4 and 6.

Keywords: plagiarism; thesis proposal; rabin-karp.

I. INTRODUCTION
The rapid development of information technology makes it easy for users to obtain data and information, whether in the form of text, images, audio or video. However, this ease of access also has a negative impact: it makes plagiarism easy.
According to [1], the problem of plagiarism is increasing because of the digital era. Plagiarism is the use of data, language, and writing without crediting the original author or source [1]. According to the Regulation of the Minister of National Education of the Republic of Indonesia No. 7 of 2010, plagiarism is the act, intentional or unintentional, of obtaining or trying to obtain credit or value by quoting part or all of the scientific work of others without stating the source exactly and adequately. According to [2], the academic world is the place where plagiarism is most common. Indonesia, a developing country, is not the only one affected by plagiarism. According to [3], several cases of plagiarism were also found in developed countries. In the academic world, the most frequently plagiarized material is scientific work, for example theses. Reminding students is not enough to minimize the practice of plagiarism. A system or application is therefore needed to measure the level of similarity of student theses in order to minimize the practice of plagiarism [3].

In computer science, the Rabin-Karp algorithm can be used to measure the level of similarity of texts. The Rabin-Karp algorithm is a string matching algorithm that uses a hash function to compare the search string (m) with substrings of the text (n). If the two hash values are the same, the characters themselves are then compared. Comparisons proceed from left to right across the text, with the search string shifted (n-m) times [4]. According to [5], the Rabin-Karp Algorithm is a string search algorithm that can work on large data sizes. It is on this basis that the Rabin-Karp Algorithm is used in this research. The Rabin-Karp Algorithm has previously been applied by [6].
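The hash-then-verify procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the function name `rabin_karp`, the base 256 and the modulus 1,000,003 are assumptions chosen only for illustration, and here n and m denote the lengths of the text and of the search string.

```python
def rabin_karp(text: str, pattern: str, base: int = 256, mod: int = 1_000_003) -> list:
    """Return every start index at which pattern occurs in text.

    base and mod are illustrative assumptions, not values from the paper.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character of a window: base^(m-1) mod mod.
    high = pow(base, m - 1, mod)

    # Hash of the pattern and of the first window of the text.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    matches = []
    for s in range(n - m + 1):
        # Equal hashes may still be a collision, so compare the characters.
        if p_hash == w_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:
            # Roll the hash: drop text[s], append text[s + m].
            w_hash = ((w_hash - ord(text[s]) * high) * base + ord(text[s + m])) % mod
    return matches


print(rabin_karp("abracadabra", "abra"))
```

The window is shifted n-m times after the first position, so n-m+1 positions are examined in total, and the character comparison runs only when the two hash values agree.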
In that study [6], the Rabin-Karp Algorithm was used to optimize heterogeneous solutions that take into account the benefits and limits of achieving Online Analytical Processing (OLAP) in terms of response time. The results showed better performance in terms of response time and memory usage. Several researchers have conducted research related to string matching, including [1], who used architecture and algorithms to detect copying or plagiarism. Dayarathne and Ragel [5] used the Rabin-Karp Algorithm to find the occurrences of a pattern in the text of a collection of strings. Similarly, [7] and [8] used the Rabin-Karp Algorithm to match text or string patterns. Rasywir et al. [9] used the Rabin-Karp Algorithm to automatically evaluate students' essay answers. On this basis, this research uses the Rabin-Karp algorithm to measure the level of similarity of student research proposals.

II. METHOD
The proposed system design consists of two parts, namely data collection and measurement of the level of similarity. The first step is to collect the data on existing student proposals in the database. The next step is to measure the level of similarity of a submitted student research proposal by comparing it with the proposal data stored in the database. Proposal text, whether it is to be stored in the database or measured for similarity, must first pass through preprocessing, where this process