Multimed Tools Appl
DOI 10.1007/s11042-017-4708-8

Improving content-based image retrieval for heterogeneous datasets using histogram-based descriptors

Carolina Reta 1 · Ismael Solis-Moreno 2 · Jose A. Cantoral-Ceballos 3 · Rogelio Alvarez-Vargas 3 · Paul Townend 4

Received: 5 July 2016 / Revised: 3 March 2017 / Accepted: 12 April 2017
© Springer Science+Business Media New York 2017

Carolina Reta carolina.reta@ciateq.mx
Ismael Solis-Moreno ismael@mx1.ibm.com
Jose A. Cantoral-Ceballos jose.cantoral@ciateq.mx
Rogelio Alvarez-Vargas ralvarez@ciateq.mx
Paul Townend p.m.townend@leeds.ac.uk

1 Division of IT, Electronic & Control, CONACYT-CIATEQ, Av. Diesel Nacional No. 1 Ciudad Sahagun, Hidalgo 43990, Mexico
2 Mexico Software Lab., IBM, Guadalajara, Jalisco, Mexico
3 Division of IT, Electronic & Control, CIATEQ, El Marques, Queretaro, Mexico
4 School of Computing, University of Leeds, Leeds, UK

Abstract Image content analysis plays a key role in areas such as image classification, clustering, indexing, retrieving, and object and scene recognition. However, although several image content descriptors have been proposed in the literature, their low performance score or high computational cost makes them unsuitable for content-based image retrieval on large datasets. This paper presents an efficient content-based image retrieval approach that uses histogram-based descriptors to represent color, edge, and texture features, and a k-nearest neighbor classifier to retrieve the best matches for query images. The compactness and speed of the proposed descriptors allow their application in heterogeneous photographic collections whilst showing strong image discrimination in the presence of significant content variation. Experimentation was conducted on four different image collections using four distance metrics. The results show that the proposed approach consistently achieves