(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 9, 2020

Unified Approach for White Blood Cell Segmentation, Feature Extraction, and Counting using Max-Tree Data Structure

Bilkis Jamal Ferdosi
Department of Computer Science and Engineering
University of Asia Pacific
Dhaka, Bangladesh

Abstract—Accurate identification and counting of White Blood Cells (WBCs) in microscopy blood cell images are vital for diagnosing several blood-related diseases, such as leukemia. The need for automated cell image analysis in medical diagnosis has produced a plethora of research over the last few decades. Microscopic blood cell image analysis involves three major steps: cell segmentation, classification, and counting. Several techniques have been employed separately to solve these three problems. In this paper, a simple unified model is proposed for White Blood Cell segmentation, feature extraction for classification, and counting, using connected mathematical morphological operators implemented with the max-tree data structure. The max-tree creates a hierarchical representation of the connected components at all gray levels present in an image, such that the root holds the connected component comprising pixels with the lowest intensity value and the leaves hold the connected components comprising pixels with the highest intensity values. Any associated attribute, such as the size or shape of each connected component, can be efficiently calculated on the fly and stored in this data structure. Utilizing this knowledge-rich data structure, we obtain a better segmentation that preserves the morphology of the cells and consequently obtain better accuracy in cell counting.

Keywords—Segmentation; feature extraction; White Blood Cell (WBC); mathematical morphology; max-tree

I. INTRODUCTION

Microscopic blood cell image analysis is crucial for the diagnosis of several blood-related diseases.
It may require a complete blood count (CBC), in which red blood cells, white blood cells, and platelets are counted. In some cases, a differential blood count (DBC) may be required, in which the five types of white blood cells (eosinophils, basophils, monocytes, lymphocytes, and neutrophils) need to be separated and counted. Blood image analysis is also crucial in the diagnosis of leukemia, where lymphoblasts need to be separated from the healthy WBCs and counted. Manual analysis by human experts is time-consuming, its accuracy depends heavily on the expert's capability, and results may vary even when the same expert repeats the procedure. Thus, image-based analysis of blood cells has gained much popularity in the past decades.

Image-based automated blood cell analysis poses three major challenges: segmentation, feature extraction for classification, and counting of cells from very complex blood smear images. To solve the challenging problem of cell segmentation, several approaches have been utilized in the literature. These include clustering-based approaches such as expectation maximization (EM) [1], [2], the K-means method [3], [4], and the fuzzy C-means method [5]; type-2 fuzzy logic [6]; a thresholding-based approach [7]; an edge-detection-based method [8]; a shape-based matching method [9]; machine learning [10]; energy minimization [11]; Gram-Schmidt orthogonalization [12]; a combination of several image processing techniques such as thresholding, k-means clustering, and a modified watershed algorithm [13]; and morphological operators [14], to mention a few. For classification, features such as morphological and textural features have been used [15]–[17]. A few others used genetic features extracted with a genetic algorithm [18]–[20]. Finally, the different types of classified cells need to be counted.
For counting, some methods require prior cell segmentation and detection [21], while a few others approximate the number of cells from a density estimated from user annotations, trading accuracy for speed [22], [23]. A method of cell counting based on morphological image analysis of blood cell images without requiring user annotation is reported in [24].

From segmentation to counting, widely varying techniques have been utilized in the literature. There is no unified approach that facilitates all three analysis steps of segmentation, feature extraction for classification, and counting. Inspired by the work in [24], this paper tries to use the full potential of the max-tree data structure, which is an efficient structure for morphological connected operators.

Morphological connected operators work on the connected components of a gray-level image, known as flat zones, and preserve only those flat zones that satisfy given criteria, removing the rest [25], [26]. The criteria can be based on one or more attributes computed from the flat zones. The max-tree data structure makes the processing steps of these operators efficient. A max-tree is a structured representation of an image in which the connected components with the highest intensity are in the leaves of the tree, the connected component with the lowest intensity is in the root, and the remaining nodes hold the connected components for all threshold levels present in the image. In addition, the nodes of the tree can store a plethora of knowledge, such as size and shape granulometry, texture, moment, or motion-oriented attributes. In this paper, the capability of this knowledge-rich data structure has been