Data Model of Echocardiogram Video for Content Based Retrieval

Aditi Roy 1, V. Pallavi 2, Avishek Saha 2, Shamik Sural 1, J. Mukherjee 2, and A.K. Majumdar 2
1 School of IT, 2 Department of Computer Science and Engineering
IIT, Kharagpur, India
E-mail: {aditi.roy, shamik}@sit.iitkgp.ernet.in, {pallavi, avishek, jay, akmj}@cse.iitkgp.ernet.in

ABSTRACT

In this work we propose a new approach for the modeling and management of echocardiogram video data. We introduce a hierarchical state-based model that represents an echo video in terms of the objects present and their dynamic behavior. The echo video is first segmented by 'view', which describes which objects are present. Object behavior is then described by states and state transitions using a state transition diagram, which is used to partition each view segment into states. We apply a novel technique, the Sweep M-mode, for detecting these states. For efficient retrieval of information, we propose indexes based on the views and the states of objects. The proposed model thus allows information about similar types of video data to be stored in a single database schema and supports content-based retrieval of video data.

KEY WORDS

Echocardiogram video, View, State, Sweep M-mode, Hierarchical state-based modeling.

1. Introduction

Echocardiography is a popular ultrasound-based method for analyzing cardiac structure and function [1]. In this paper, we address the analysis of the spatio-temporal structure of echo videos for the purposes of temporal segmentation, browsing, and efficient content-based retrieval.

In the recent past, advances have been made in content-based retrieval of medical images. Research on echo video summarization, temporal segmentation for interpretation, storage, and view-based content-based retrieval of echo video has been reported [2][3]. However, these methods depend heavily on available domain knowledge, such as the spatial structure of the echo video frames in terms of a 'Region of Interest' (ROI).
On the other hand, an approach to semantic content-based retrieval of video data using an object state transition model has been put forward in [4][5]. In that work, echo videos are segmented based on the states of the heart object. Thus, view-based modeling and state-based modeling of echo video have so far been done separately. Hierarchical state-based modeling that combines views and states, so that information about similar types of video data can be stored in a single database schema for efficient content-based retrieval, is, to the best of our knowledge, an open problem.

In our work, we perform hierarchical segmentation of echo video based on the views and the states of the heart object by exploiting the specific structure of such videos. The advantage of our approach is that it allows storage and indexing of the echo video at different levels of abstraction based on semantic features of the video objects.

The rest of the paper is organized as follows. In Section 2, we explain the modeling of echo video based on views and states. In Section 3, we discuss the proposed system architecture, and we conclude in Section 4.

2. Modeling Echocardiogram Video based on Views and States

Modeling of video data is important for defining the object schema used in the database. In the case of echo video, our aim is to retrieve the dynamic behavior of the video objects. To achieve this, we introduce a hierarchical state-based model. Due to the temporal nature of video data, the visual information must be structured and broken down into meaningful components.

2.1 Video Segmentation based on View

As mentioned above, the first step of video processing for hierarchical state-based modeling is segmentation of the input video into shots. A shot is conventionally defined as a sequence of interrelated frames, captured from the same camera location, that represents a continuous action in time and space. For echo video segmentation, this traditional definition of a shot is not applicable.
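The outcome of this hierarchical segmentation, view segments that are further subdivided into state segments, can be pictured as a simple nested schema. The following Python sketch is purely illustrative; all class, field, and method names are our own and the paper does not prescribe a concrete schema:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the hierarchical view/state model.
# Names (StateSegment, ViewSegment, EchoVideo, index_by_view)
# are assumptions, not taken from the paper.

@dataclass
class StateSegment:
    # A state of the heart object, delimited by frame numbers
    # within the parent view segment.
    state_label: str
    start_frame: int
    end_frame: int

@dataclass
class ViewSegment:
    # A contiguous run of frames showing one echocardiographic view.
    view_label: str
    start_frame: int
    end_frame: int
    states: list = field(default_factory=list)  # StateSegment children

@dataclass
class EchoVideo:
    video_id: str
    views: list = field(default_factory=list)   # ViewSegment children

    def index_by_view(self):
        # A minimal view-based index: view label -> list of segments,
        # in the spirit of the view/state indexes proposed in the paper.
        idx = {}
        for v in self.views:
            idx.setdefault(v.view_label, []).append(v)
        return idx
```

Storing videos in this nested form lets queries address a recording at either level of abstraction: by view label alone, or by the states within a view.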
Echocardiogram video is obtained by scanning the cardiac structure with an ultrasound device. Hence, depending on transducer placement, different views of the heart are obtained. The views considered in this paper are: Parasternal Long Axis view, Parasternal Short Axis view, Apical view, Color Doppler, and one-dimensional image.

Echocardiogram View Detection and Classification. We explore two methods for detecting shot boundaries, namely histogram-based comparison and edge change ratio. We combine them using majority voting to detect shots in echo video and obtain 98% accuracy [6].
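The two detectors can be sketched as follows. This is a minimal NumPy illustration, not the implementation evaluated in [6]: the thresholds are arbitrary placeholders, the edge detector is a crude gradient threshold rather than a tuned one, and, since only two detectors are combined here, the vote is reduced to requiring agreement of both:

```python
import numpy as np

def hist_diff(f1, f2, bins=32):
    # Normalized grey-level histogram difference between two frames.
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 256))
    return np.abs(h1 - h2).sum() / f1.size

def edge_map(f, thresh=30.0):
    # Crude edge detector: gradient magnitude above a threshold.
    gy, gx = np.gradient(f.astype(float))
    return np.hypot(gx, gy) > thresh

def edge_change_ratio(f1, f2):
    # Fraction of edge pixels that enter or exit between two frames.
    e1, e2 = edge_map(f1), edge_map(f2)
    n1, n2 = e1.sum(), e2.sum()
    if n1 == 0 or n2 == 0:
        return 0.0
    entering = np.logical_and(e2, ~e1).sum() / n2
    exiting = np.logical_and(e1, ~e2).sum() / n1
    return max(entering, exiting)

def shot_boundaries(frames, t_hist=0.4, t_ecr=0.5):
    # Declare a boundary before frame i only when both detectors
    # agree (a stand-in for the voting scheme of the paper).
    cuts = []
    for i in range(1, len(frames)):
        vote_h = hist_diff(frames[i - 1], frames[i]) > t_hist
        vote_e = edge_change_ratio(frames[i - 1], frames[i]) > t_ecr
        if vote_h and vote_e:
            cuts.append(i)
    return cuts
```

Combining the two cues makes the detector more robust than either alone: histogram comparison is insensitive to motion but blind to spatial rearrangement, while the edge change ratio captures structural change but is noisier.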