International Journal of Computer Science Trends and Technology (IJCST) Volume 9 Issue 5, Sep-Oct 2021 ISSN: 2347-8578 www.ijcstjournal.org Page 86

Subjective Answer Evaluation System

Alok Kumar [1], Aditi Kharadi [2], Deepika Singh [3], Mala Kumari [4]
Department of Computer Science and Engineering, CSJM University, Kanpur

ABSTRACT
Automated text scoring (ATS), or subjective-answer evaluation, is one of the major hurdles in the technological advancement of academics. Reading each answer meticulously and scoring it impartially is a monotonous task for many in the teaching profession, especially when the answers are long. Another significant challenge is comprehending the student's handwriting. Marking criteria may also vary widely across domains: in some cases credit is given for correct grammar, while other domains require certain keywords to be present in the student's answer. In this paper, we approach this problem from three perspectives: two standard linguistic approaches and a deep learning approach. The first approach uses the presence of certain keywords as the marking criterion and includes a handwriting recognizer that extracts text from scanned images of handwritten answers. The second approach measures the similarity between the student's (understudy) answer and a benchmark answer. This paper also proposes a sequential model, trained on the Automated Student Assessment Prize - Automated Essay Scoring (ASAP-AES) dataset, for evaluating long answers.

Keywords: Natural Language Processing (NLP), Cosine Similarity, Jaccard Similarity, Synonym Similarity, Bigram Similarity, Sequential Model, LSTM, Root Mean Squared Error (RMSE)

I. INTRODUCTION
The history of automated answer evaluation is quite long. Objective-answer evaluation systems are now abundant, but the same cannot be said for subjective-answer evaluation. Manual answer evaluation is a time-consuming job that also requires considerable manpower, and human error can make the scoring partial to some students, which is undesirable. Our system therefore evaluates answers using three different approaches. The motivation behind using three approaches is to obtain the best results in every possible domain, since different domains require different bases for evaluation. An answer about a historical event is expected to contain certain keywords, such as a date, a place, or a name, which is not the case for essays and other domains where the main focus is on overall meaning. Answer evaluation, or ATS, is the task of scoring a text using a set of statistical and NLP measures or neural networks. Some domains may take the quality of the answer as the scoring criterion, which depends heavily on the keywords present in the answer. The first approach revolves around the keywords present in the answer and the keywords expected to be present in it; it counts the matched keywords and uses this count, along with the length of the answer, to generate the final score. The second approach measures the similarity of the understudy answer against the model answer using four measures: cosine similarity, Jaccard similarity, synonym similarity, and bigram similarity. These scores are combined to produce the final score. The third approach is mainly used to evaluate long answers, using a sequence-to-vector model with stacked layers trained on the ASAP-AES [22] dataset.
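The paper does not give closed-form definitions of these measures, but the first two approaches can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the authors' implementation: the tokenizer, the equal-weight combination of the measures, and the `expected_keywords` parameter are all assumptions made for the sketch, and synonym similarity is omitted because it would require a lexical resource such as WordNet.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_score(answer, expected_keywords):
    """Approach 1 (sketch): fraction of expected keywords found in the answer."""
    tokens = set(tokenize(answer))
    matched = sum(1 for kw in expected_keywords if kw.lower() in tokens)
    return matched / len(expected_keywords) if expected_keywords else 0.0

def cosine_similarity(a, b):
    """Cosine similarity between term-frequency vectors of the two texts."""
    ta, tb = tokenize(a), tokenize(b)
    vocab = set(ta) | set(tb)
    va = {w: ta.count(w) for w in vocab}
    vb = {w: tb.count(w) for w in vocab}
    dot = sum(va[w] * vb[w] for w in vocab)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard_similarity(a, b):
    """Size of the token-set intersection over the size of the union."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def bigram_similarity(a, b):
    """Jaccard similarity computed over adjacent word pairs (bigrams)."""
    def bigrams(t):
        toks = tokenize(t)
        return set(zip(toks, toks[1:]))
    ba, bb = bigrams(a), bigrams(b)
    return len(ba & bb) / len(ba | bb) if ba | bb else 0.0

def combined_score(student, model, expected_keywords):
    """Combine the measures into one score in [0, 1] (equal weights assumed)."""
    parts = [
        keyword_score(student, expected_keywords),
        cosine_similarity(student, model),
        jaccard_similarity(student, model),
        bigram_similarity(student, model),
    ]
    return sum(parts) / len(parts)
```

In practice the combined score would be rescaled to the question's maximum marks, and the relative weights of the individual measures tuned per domain, as the paper's motivation for multiple approaches suggests.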
We have elaborated the methodologies of all three approaches in section 3. The system evaluation is described in section 4. The final results and the conclusion and future work are presented in sections 5 and 6, respectively.

II. RELATED WORKS
Many researchers have proposed influential and novel approaches for the task of ATS. One of the earliest essay scoring systems was Project Essay Grade [1], which used linear regression over vector representations of the answer text for scoring. Patil et al. [2] suggest a purely linguistic approach for scoring subjective answers in text format after extracting them from scanned images of handwritten answers. Many researchers have treated ATS as a supervised text classification task (Rudner et al. [4], Sakaguchi et al. [5]). Landauer et al. [3], for example, proposed the Intelligent Essay Assessor, which scores essays using Latent Semantic Analysis.