Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

International Journal of Engineering & Technology, 7 (4.44) (2018) 156-160
Website: www.sciencepubco.com/index.php/IJET
Research paper

Open Problems in Indonesian Automatic Essay Scoring System

Faisal Rahutomo 1 *, Trisna Ari Roshinta 2, Erfan Rohadi 3, Indrazno Siradjuddin 4, Rudy Ariyanto 5, Awan Setiawan 6, Supriatna Adhisuwignjo 7
1,2,3,4,5,6,7 State Polytechnic of Malang
*Corresponding author E-mail: faisal@polinema.ac.id

Abstract

This paper presents open problems in Indonesian automatic essay scoring. A previous study compared several similarity metrics for automated essay scoring in Indonesian: Cosine Similarity, Euclidean Distance, and Jaccard. The data used in that research consist of about 2,000 texts, obtained from 50 students who answered 40 questions on politics, sports, lifestyle, and technology. The study also evaluated the effect of stemming on system performance. Across the methods, the difference between using stemming and not using it is around 4-9%. The results show that Jaccard is the best metric both with and without stemming. Jaccard with stemming has the lowest percentage error of the methods compared. The politics category has a higher average similarity score than the lifestyle, sport, and technology categories. With stemming, the percentage error is 52.31% for Jaccard, 59.49% for Cosine Similarity, and 332.90% for Euclidean Distance. Jaccard is also the best method without stemming: the percentage error is 56.05% for Jaccard, 57.99% for Cosine Similarity, and 339.41% for Euclidean Distance. However, these percentage errors are too high for a functional essay grading system.
The percentage errors are relatively high, at more than 50%. Therefore, this paper explores several open problems on this issue. The openly available dataset can be used to develop better approaches than the standard similarity metrics. The approaches discussed range over feature extraction, similarity metrics, learning algorithms, environment implementation, and performance evaluation.

Keywords: Indonesian, Natural language processing, Automatic essay scoring system, Open problems.

1. Introduction

Every learning process requires an evaluation to measure the level of students' understanding. There are many types of evaluation, including multiple-choice questions, short questions, and essay questions. Some studies have shown that essay questions are better than the other types when the student's knowledge is to be evaluated thoroughly [1]. The problem, however, is that the rating process is time-consuming: the teacher must read and evaluate each student's answer sentence by sentence.

Nowadays, many information technologies are developed to automate human activities. In education, one developing example is essay grading. Researchers have worked on automated essay scoring (AES) since the 1960s [2]. Automated grading offers many advantages over conventional grading. It is reported that teachers in Britain spend about 30% of their time scoring students' answers, at a cost of about 30 billion pounds per year [3]. The application of an automated essay scoring system would therefore bring many benefits.

Automated essay scoring systems have been developed with many different methods. However, no study indicates which method is better for automated essay scoring, especially in Indonesian. The previous research [4] reports the average errors of some methods that are commonly used in automated essay scoring in Indonesian.
The average error of each method is calculated by comparing the scores from human raters with the scores from the system. The methods are Cosine Similarity, Euclidean Distance, and Jaccard. The results show that Jaccard is the best approach, but its average error is still high, at more than 50%.

Therefore, this paper presents several ideas that can be explored further on this issue, taking advantage of the openly available dataset at http://dx.doi.org/10.17632/6gp8m72s9p.1 [5]. Several evaluations can be done by changing the parameters, such as feature extraction, similarity metric, learning algorithm, environment implementation, and performance evaluation.

This paper is divided into several chapters. Chapter 1 is the introduction. Chapter 2 summarizes the previous study in English, because the report by Roshinta and Rahutomo [4] is written in Indonesian. Chapter 3 explores further ideas and open problems on this issue. Finally, Chapter 4 concludes the paper.

2. Indonesian essay scoring system

Roshinta and Rahutomo [4] propose a web-based automated essay scoring system for Indonesian. The research also develops a dataset for performance evaluation [5]. The study consists of several phases. First, the dataset is developed; it contains question texts with corresponding answer texts, and the questions are classified into four categories: lifestyle, politics, sport, and technology. Second, the web-based automated essay scoring system is developed. Third, student respondents answer the questions through the web-based application, and the system calculates the score with the three methods. Fourth, the students' answers are scored manually by three lecturer respondents; the final score is the average of the three lecturers' scores and serves as the gold standard. Finally, the average percentage error between the manual scores and the system scores is calculated for each method.
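
The scoring and evaluation phases described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in [4]: the whitespace tokenizer, the function names, the two sample sentences, and the definition of percentage error as |system score - human score| / human score x 100 are all assumed here, since this section does not state the exact formulas. No stemming is applied.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed preprocessing, no stemming)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def euclidean_distance(a, b):
    """Euclidean distance between term-frequency vectors (0 means identical counts)."""
    ca, cb = Counter(a), Counter(b)
    return math.sqrt(sum((ca[t] - cb[t]) ** 2 for t in ca.keys() | cb.keys()))


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def gold_standard(rater_scores):
    """Average of the human raters' scores for one answer."""
    return sum(rater_scores) / len(rater_scores)


def mean_percentage_error(system_scores, human_scores):
    """Assumed definition: mean of |system - human| / human * 100.

    Human scores must be non-zero for this definition to be usable.
    """
    errors = [abs(s - h) / h * 100 for s, h in zip(system_scores, human_scores)]
    return sum(errors) / len(errors)


# Hypothetical reference answer and student answer (not taken from the dataset).
reference = tokenize("pemilu adalah pemilihan umum")
answer = tokenize("pemilu adalah pemilihan presiden")

jaccard = jaccard_similarity(reference, answer)
cosine = cosine_similarity(reference, answer)
distance = euclidean_distance(reference, answer)
```

For the two sample sentences, three of the five distinct words are shared, so the Jaccard similarity is 0.6 and the cosine similarity is 0.75, while the Euclidean distance is the square root of 2. Note that Euclidean distance is a dissimilarity that grows without bound, so it needs some conversion before it can be compared with a human score; how [4] handles this is not described in this section.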