Dynamic vocal fold imaging with combined optical coherence tomography/high-speed video endoscopy

Nicusor Iftimia 1, Gopi Maguluri 1, Ernest Chang 1, Jesung Park 1, James Kobler 2 and Daryush Mehta 2

1. Physical Sciences Inc., 20 New England Business Center, Andover, MA 01810
2. Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114

1. Introduction

Voice disorders due to trauma (e.g., intubation injuries, vocal abuse) and disease (e.g., dysplasia, cancer, recurrent respiratory papillomatosis, nodules, polyps, and scar) are currently evaluated in otolaryngology clinics using endoscopic imaging techniques such as videostroboscopy [1] or high-speed videoendoscopy (HSV) [2]. Clinicians couple these visual observations of vocal fold tissue motion with auditory-perceptual judgments of voice quality as part of a comprehensive assessment of the health and function of the larynx during phonation [3]. Endoscopic imaging, however, provides only two-dimensional (2D) spatial information. It therefore quantifies only the lateral tissue motion and lacks information along the vertical axis. Various high-speed imaging techniques have attempted to capture vocal fold surface motion during phonation in three spatial dimensions [4–6]; however, they either lack adequate spatial resolution or have not been validated in vivo.

In this paper, we present a dual-modality imaging approach in which optical coherence tomography (OCT) imaging augments HSV by providing the missing depth information. OCT quantitatively measures the vertical motion of the vocal fold surface during phonation with micron-scale resolution. It also provides the subsurface structural morphology to a depth of at least 1.5 mm. The combination of these two modalities within the same instrument therefore seems to be a suitable approach for examining the pathology of the vocal folds.

2. Methods

A swept-source OCT/high-speed video (SSOCT/HSV) endoscopy instrument, in which the two modalities share a common optical imaging path, was developed and used in an ex vivo study on excised animal tissue specimens. A simplified schematic of the instrument is shown in Figure 1.

The OCT component of the system uses a swept-source (SS) approach. The light source (Santec) operates at a sweep rate of 20 kHz and provides a broad spectrum with a 3 dB bandwidth of 100 nm at a center wavelength of 1310 nm. This enables subsurface tissue imaging with an axial resolution better than 10 µm. A fiber-optic interferometer and a balanced detector are used to generate interference fringes by combining the light retro-reflected from the imaged sample with the light returned from a mirror placed in the reference arm of the interferometer. Interference occurs when the path-length difference between the two arms of the interferometer is within the coherence length of the light source. Each wavelength sweep of the light source generates one OCT A-scan, which yields a depth-resolved reflectivity profile of the sample.

The A-scan signals from the balanced detector are fed to a custom-built FPGA module that acquires data when it receives A-scan triggers from a custom-designed trigger circuit. These triggers are sent only when a signal from the vocal fold pressure sensor is detected, such that each full movement cycle of the vocal folds, also called a phonation cycle, is sampled at a sufficient number of points (N > 10). The same triggers are fed to the HSV camera, so that the OCT and HSV data are precisely correlated in time. The digitized signals are sent through a Camera Link interface to a frame grabber (NI

Figure 1. Simplified schematic of the SSOCT/HSV instrument.

Figure 2. OCT-HSV instrument. Setup showing (a) the OCT probe head mounted on the high-speed camera for co-linear imaging through a rigid endoscope and (b) the ex vivo calf larynx being imaged.
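The swept-source parameters quoted in the Methods (1310 nm center wavelength, 100 nm 3 dB bandwidth, 20 kHz sweep rate) can be checked with a short numerical sketch. It is illustrative only and is not the instrument's processing code. It assumes a Gaussian source spectrum, for which the axial resolution is (2 ln 2 / π) λ0² / Δλ, and it models an A-scan as the discrete Fourier transform of a synthetic fringe from a single ideal reflector. The function names, the 256-sample sweep, the 20-cycle fringe, and the 200 Hz phonation frequency are hypothetical values chosen for the example and do not come from the paper.

```python
import cmath
import math


def axial_resolution_um(center_nm, bandwidth_nm, refractive_index=1.0):
    """FWHM axial resolution in micrometers, assuming a Gaussian spectrum."""
    dz_nm = (2.0 * math.log(2.0) / math.pi) * center_nm ** 2 / bandwidth_nm
    return dz_nm / refractive_index / 1000.0


def max_ascans_per_cycle(sweep_rate_hz, phonation_hz):
    """Upper bound on the number of A-scans within one phonation cycle."""
    return sweep_rate_hz / phonation_hz


def reflectivity_profile(fringe):
    """Magnitude of the DFT of one spectral fringe (positive depths only)."""
    n = len(fringe)
    return [
        abs(sum(fringe[k] * cmath.exp(-2j * math.pi * m * k / n)
                for k in range(n)))
        for m in range(n // 2)
    ]


# Source parameters stated in the text.
resolution_air = axial_resolution_um(1310.0, 100.0)

# Hypothetical phonation frequency of 200 Hz (assumption, not from the paper).
ascans = max_ascans_per_cycle(20_000.0, 200.0)

# Synthetic fringe: one reflector giving 20 oscillations across a
# 256-sample sweep (both numbers are assumptions for illustration).
n_samples = 256
fringe = [math.cos(2.0 * math.pi * 20 * k / n_samples) for k in range(n_samples)]
profile = reflectivity_profile(fringe)
peak_bin = max(range(len(profile)), key=profile.__getitem__)

print(f"axial resolution in air: {resolution_air:.2f} um")
print(f"A-scans per cycle (upper bound): {ascans:.0f}")
print(f"reflector found at depth bin: {peak_bin}")
```

Under the Gaussian-spectrum assumption the resolution in air comes out at roughly 7.6 µm, consistent with the stated figure of better than 10 µm. The fringe frequency maps directly to a depth bin, which is the sense in which each wavelength sweep yields one reflectivity profile.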