Volume 66, Issue 1, 2023, Journal of Scientific Research of The Banaras Hindu University

Evaluation of Descriptive Answer by using Probability Approach, Cosine Similarity and Pretrained model

Bhagat Gayval *1 and Vanita Mhaske 2
*1 Department of Statistics, SP College, Pune, Maharashtra, India. gayvalbk@gmail.com
2 Department of Computer Science, PVG’s College of Science and Commerce, Pune, Maharashtra, India. vanitamhaske04@gmail.com

Abstract: Evaluating descriptive answers is important for analyzing the growth of students, and it is useful for job interviews, for academic purposes, and in many other fields. With the increase in online exams due to the pandemic, objective-type questions are evaluated by various software tools, but systems for evaluating descriptive answers are lacking. Because manual evaluation is time-consuming, this research scores answers using a probability approach, a pre-trained model, and a cosine similarity approach, and compares each with the score manually assigned by a subject expert. The analysis concludes that the probability approach provides efficient results compared to the other methods.

Index Terms: Cosine Similarity, Descriptive answer, NLTK, Probability Approach, Similarity Score.

I. INTRODUCTION
The COVID pandemic made online exams a familiar experience. The education sector now holds a large amount of online student data, such as Google Forms responses and assignments submitted for examination purposes. Examinations play a vital role in a student’s academic life, and because of the huge amount of data involved, it is important to handle it with a proper system. During the pandemic, many institutions also shifted their examinations online. Objective-type questions are easy to evaluate and can be scored automatically with correct results. But the main purpose of an exam is to assess how much students have understood. Descriptive answers help in checking overall student growth, progress, and positive change.

However, evaluating descriptive answers in online mode is difficult. These are lengthy textual answers, and it becomes difficult for an examiner to evaluate dozens of student submissions. The evaluation may also become biased, whether in the marks assigned to a paper or towards particular students when their answers are compared. To compute the scores obtained by students, we have used Natural Language Processing (NLP), an existing tool, and a probability approach.

II. OBJECTIVE
The main objective of this research is to use text analysis through a probabilistic approach, a pre-trained model, and a cosine similarity approach to accurately evaluate descriptive answers in an online mode. We use these three techniques to score each student answer, and then compare those scores with the scores given by a teacher or subject expert. Our purpose is to find out which method gives better results.

III. DATA PRE-PROCESSING
For this study, we collected responses from students who had a basic idea about the experiment. To gather answers from students, we asked the question, "What is the deterministic experiment?" We collected 129 samples of data through a Google Form. Data collected online is not structured, and structured data is needed before additional tools can be applied. For comparison purposes, we also need the ideal score, which is assigned manually by subject experts. The primary data is converted into a structured format using NLTK.

A. Lower
If the textual content is in a single case, it is easier for a machine to interpret the words, because the machine handles lower-case and upper-case letters differently. As an example, words