British Journal of Science 1 December 2016, Vol. 14 (2) © 2016 British Journals ISSN 2047-3745

Improved Fuzzy C-means for Document Image Segmentation

Abstract

Interest in the automatic analysis and segmentation of document images has increased in recent years. Document segmentation plays an important role in document analysis: every day, thousands of documents, including government files, technical reports, books, newspapers, and magazines, need to be processed so that their contents, both text and non-text components, can be accessed intelligently. Performing this processing automatically saves a great deal of time, money, and effort. This paper therefore introduces a new document image segmentation approach based on a proposed improved fuzzy C-means (IFCM) algorithm. The approach focuses on segmenting the text, image, and background pixels of scanned document images using the statistical features of region pixels; the resulting areas are collected and then clustered into text and non-text areas. In this approach, a document image is segmented into several non-overlapping regions by a novel recursive clustering technique that relies on the statistical features of each pixel and its neighborhood. The performance of the method is evaluated by examining a variety of complex document images, such as newspaper layouts, and artificially segmenting the text and images. Performance is recorded in terms of both quantitative and qualitative measures. The experimental results are promising and encouraging and, unlike many techniques of this class, are obtained without any auxiliary techniques for pixel segmentation. They demonstrate the feasibility and practicality of IFCM, which can provide near-optimal solutions to document layout analysis problems. The method achieved an accuracy rate of 95.21% and a recall rate of around 97.51% on a set of 390 documents, which confirms the robustness of the proposed algorithm. However, the overall precision is higher due to the different evaluation metrics.
Although pixel-wise evaluation allows for more accurate improvement, these evaluation metrics reflect the objective of IFCM.

Keywords: Fuzzy C-means (FCM), Fuzzy C-means Algorithm, Clustering Algorithms, Cluster Center, Clustering, Document Image Segmentation, Document Image Analysis, Document Image.

Introduction

Nowadays, a wide variety of information is being made available and converted into electronic format for efficient storage and processing. This requires handling documents with image analysis techniques. Document analysis techniques decompose a document image into consistent items that represent the components of the document, such as text, graphics, and tables, without prior knowledge of a specific format. Document images are frequently generated from physical documents by digitization with scanners or digital cameras. Various documents, such as newspapers and magazines, have very complex layouts. Automatic analysis of a document with a complex structure and layout is considered a difficult task that is beyond the capabilities of current document layout analysis systems.

Hasanen S. Abdullah
University of Technology, Computer Sciences Department
E-mail: qhasanen@yahoo.com

Ammar H. Jassim
University of Baghdad / College of Science for Women, Department of Computer Science
E-mail: ammar_hussein_2004@yahoo.com