Mask R-CNN: A Comparative Study on Improvements in Object Detection and Segmentation

Shashwat Shukla 1, Narayana Darapaneni 2, Anwesh Reddy Paduri 3, Sudha B G 4, Abhishek K 5, Arun Kumar BS 6, Chandrashekar KP 7, Deepak Kumar T P 8, Prajwal S Shetty 9, Rahul Kumar Verma 10

IIIT Lucknow, India 1,10
Northwestern University/Great Learning, US 2
Great Learning, Bangalore, India 3,4
PES University, Bangalore, India 5–9

Corresponding author: Anwesh Reddy Paduri, Email: anwesh@greatlearning.in

Demand for image recognition has grown steadily as the world moves towards a digital space. With this growing demand, the application of Mask R-CNN and of extended algorithms based on segmentation and YOLO has seen a major rise in the last five years. The accuracy of the models has improved slightly, as most projects take different approaches to different datasets and use different evaluation metrics. Cross-referencing these approaches reveals a trend: models achieve higher success when they use the hyperparameters and extra layers that have been added to Mask R-CNN in sequence. In this paper, the approaches taken by different researchers are explored to understand how the implementations have progressed over the last half of the decade. In our research, we analyzed 50 papers and found that the majority used the COCO dataset for training, with a specified set of hyperparameters, to measure accuracy, performance, and memory consumption. The experimental findings are presented to suggest a suitable R-CNN architecture based on application or hardware attributes.

Keywords: Object Detection, Segmentation, Fast RCNN, Faster RCNN, Mask R-CNN

2023. In Satyasai Jagannath Nanda & Rajendra Prasad Yadav (eds.), Data Science and Intelligent Computing Techniques, 93–102. Computing & Intelligent Systems, SCRS, India. https://doi.org/10.56155/978-81-955020-2-8-9