1 Image Retrieval Using Keywords: The Machine Learning Perspective

Zenonas Theodosiou
Dept. of Communication and Internet Studies, Cyprus University of Technology, Limassol, Cyprus

Nicolas Tsapatsoulis
Dept. of Communication and Internet Studies, Cyprus University of Technology, Limassol, Cyprus

CONTENTS
1.1 Introduction
1.2 Background
    1.2.1 Key Issues in Automatic Image Annotation
1.3 Low-level Feature Extraction
    1.3.1 Local Features
    1.3.2 Global or Holistic Features
    1.3.3 Feature Fusion
1.4 Visual Models Creation
    1.4.1 Dataset Creation
    1.4.2 Learning Approaches
1.5 Performance Evaluation
1.6 A Study on Creating Visual Models
    1.6.1 Feature Extraction
    1.6.2 Keywords Modeling
    1.6.3 Experimental Results
1.7 Conclusion

1.1 Introduction

Given the rapid growth in the number of available digital images, image retrieval has attracted a lot of research interest over the last decades. Image retrieval research falls into two categories: content-based and text-based methods. Content-based methods take a given example image as a starting point and retrieve images by analyzing and comparing visual content.
Text-based methods are similar to document retrieval and retrieve images using keywords. Text-based retrieval is the preferred approach for both ordinary users and search engine engineers. Besides the fact that the majority of users are familiar with text-based queries, content-