Influencing Factors on Classification of Photographic and Computer Generated Images

Ahmed Talib, Massudi Mahmuddin, Husniza Husni and Loay E. George

Abstract—Classification of images into photographic (PG) and computer graphic (CG) images is useful in many applications such as web searching, image indexing and video classification. Distinguishing between PG and CG images is still a challenging task, in spite of the many studies that have been conducted. The accuracy they attain remains below an acceptable level, ranging from 70% to 90%. Those studies report good results, but only within a limited domain (for specific datasets). This paper presents the components of a classification system and the techniques used in these components in detail, and highlights the important factors that influence each component. Moreover, the effect of these factors on three aspects of performance (speed, accuracy and diversity) is discussed. This study guides researchers toward improving results in this field by identifying the important influencing factors.

Index Terms—Image Classification; Computer Generated Images; Influencing Factors; Machine Learning.

1 INTRODUCTION

The widespread use of digital cameras produces large numbers of images, which must be managed and classified. Images can be classified, according to the way in which they are generated, into photographic (PG) and computer graphic (CG) images. PG images are images captured by digital cameras, while CG images are images created on a computer or generated by rendering software. Classification of PG and CG images is useful in many applications such as web and desktop image search, image indexing, video classification and other image processing applications. Distinguishing between PG and CG images remains a challenging task for many researchers.
Breakthroughs in this field can reduce, to some extent, image forgery in criminal investigation, journalism, and intelligence services. Any classification system has two stages: the first is feature extraction and the second is classification. In this paper, we investigate and identify the factors that affect each of the two stages of the classification system. The next section describes these stages in detail. Section 3 identifies the factors that influence the stages of the classification system. Sections 4 and 5 contain the discussion and conclusion of this paper.

2 CLASSIFICATION SYSTEM DESCRIPTION

The feature extraction stage is considered the heart of the classification system. The classification stage classifies images based on the features extracted in the first stage. Both stages are influenced by different factors that may affect their performance. In the following subsections, we describe these stages and discuss their associated factors.

2.1 Feature Extraction Stage

Different features are extracted to distinguish between PG and CG images. The features can be divided into two categories: (i) features based on the visual content of an image, and (ii) features based on the physical characteristics of an image.

2.1.1 Visual-Based Features

Many researchers use visual features to differentiate between PG and CG images [1-4]. These features are usually used together with measures computed from them, such as statistical and spatial measures, because the two are highly correlated (i.e., many statistical and spatial measures are computed from a visual feature to obtain the final features used in the classification process, such as the color histogram as a statistical measure and color moments as a spatial measure).

Athitsos, Swain and Frankel [1] introduced a set of features to differentiate between PG and CG images that is considered the basis for all subsequent research in this field.
These features are color histogram, farthest neighbor, prevalent color, farthest neighbor histogram, saturation, number of colors, smallest dimension, and dimension ratio. Lienhart and Hartmann [2] selected some of the more powerful features used in [1] for classification with the AdaBoost algorithm. In addition, they try to differentiate between real photos and computer-generated but realistic-looking images by measuring noise using median and Gaussian filters. Furthermore, they classify graphical images themselves into presentation slides/scientific posters and

A. Talib is a PhD student in the School of Computing, College of Arts and Sciences, Universiti Utara Malaysia, UUM, 06010 Sintok, Kedah, Malaysia. On leave from the Foundation of Technical Education, Baghdad, Iraq.
M. Mahmuddin and H. Husni are PhD holders and senior lecturers in the Graduate Department, School of Computing, College of Arts and Sciences, Universiti Utara Malaysia, UUM, 06010 Sintok, Kedah, Malaysia.
L. E. George is Ass. Prof. in the Computer Science Department, College of Science, Baghdad University, Al-Jadriya, 10071, Baghdad, Iraq.

JOURNAL OF COMPUTING, VOLUME 4, ISSUE 2, FEBRUARY 2012, ISSN 2151-9617
https://sites.google.com/site/journalofcomputing
WWW.JOURNALOFCOMPUTING.ORG