A methodology for image annotation of human actions in videos

Moomina Waheed 1, Shahid Hussain 1, Arif Ali Khan 2, Mansoor Ahmed 1, Bashir Ahmad 3

Received: 8 July 2019 / Revised: 2 April 2020 / Accepted: 22 May 2020 / Published online: 20 June 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Multimedia Tools and Applications (2020) 79:24347–24365
https://doi.org/10.1007/s11042-020-09091-2

* Shahid Hussain, shussain@comsats.edu.pk
Moomina Waheed, moominawaheed@gmail.com
Arif Ali Khan, arif.khan1@nuaa.edu.cn
Mansoor Ahmed, mansoor1@comsats.edu.pk
Bashir Ahmad, bashahmad2@gmail.com

Abstract
In video-based image classification, image annotation plays a vital role in improving classification decisions based on image semantics. Several approaches to image annotation have been introduced, such as manual and semi-supervised methods. However, formal specification, high cost, a high probability of errors, and computation time remain major issues. To overcome these issues, we propose a new image annotation technique consisting of three tiers: frame extraction, interest point generation, and clustering. The aim of the proposed technique is to automate label generation for video frames. An evaluation model is used to assess the effectiveness of the proposed technique. The results indicate that the technique is effective for label generation of video frames, achieving 77% in terms of the Adjusted Rand Index. Finally, a comparative analysis is made between existing techniques and the proposed methodology.

Keywords Image annotation · SIFT · Clustering · Semantic analysis · Image labeling · Action recognition
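
The abstract reports effectiveness in terms of the Adjusted Rand Index (ARI), which compares a clustering of video frames against reference labels while correcting for chance agreement. The following is a minimal sketch of the standard ARI formula using only the Python standard library. It is not the authors' evaluation code; the function name `adjusted_rand_index` and the label values in the usage note are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two labelings of the same items.

    Illustrative sketch only; not taken from the paper.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on

    # Contingency counts: items sharing a (reference label, cluster) pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs placed together under each view.
    pairs_both = sum(comb(c, 2) for c in cells.values())
    pairs_true = sum(comb(c, 2) for c in rows.values())
    pairs_pred = sum(comb(c, 2) for c in cols.values())

    expected = pairs_true * pairs_pred / total_pairs  # chance agreement
    max_index = (pairs_true + pairs_pred) / 2

    if max_index == expected:
        # Degenerate case: both labelings are one cluster or all singletons.
        return 1.0
    return (pairs_both - expected) / (max_index - expected)
```

As a usage example with hypothetical data, `adjusted_rand_index(["walk", "walk", "run", "run"], [0, 0, 1, 2])` scores a clustering that splits the "run" frames into two clusters. A score of 1.0 means the clusters match the reference labels exactly (up to renaming), and scores near 0 mean agreement is no better than chance.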