Multifractal Analysis of Electroencephalogram for Human Speech Modalities

Debdeep Sikdar¹, Rinku Roy², and Manjunatha Mahadevappa³
Abstract—Verbal communication makes humans unique among other species. People use different modalities of speech while communicating with others. Widely practised modalities are speaking aloud (uttering), whispering, and mumbling with closed lips. Apart from overt speaking, people also speak in their mind. Due to various ailments or injuries, some people have lost their ability to speak and are forced to adopt other means of communication. Speech restoration through Brain Computer Interfacing (BCI) is still at a nascent stage. Through this study, we have explored the contrast between these modalities using electroencephalography (EEG), which can lead to the identification of imagined speech. As the different speech modalities appear similar in the spatiotemporal domain, we propose exploiting the nonlinearity, more specifically the multifractal nature, of the modalities present in EEG. On the basis of the multifractal parameters, we have achieved 99.7% accuracy in classification.
I. INTRODUCTION
Verbal communication is one of the most important means by which humans interact with the world in daily life. However, in conditions such as dysarthria, apraxia, aphasia, and progressive neurological diseases, e.g. Alzheimer's disease and dementia, a person is unable to produce proper speech, or any speech at all. Such people need to depend on other means of communicating with others. In some cases, a person with disabilities may even be in a completely locked-in state due to inaccessibility of those alternate means of communication. Hence, an alternate interpreter for verbal communication is essential for them to convey messages and communicate with their surroundings.
Whenever we speak, our brain coordinates the oro-
pharyngeal-laryngeal muscle group to produce proper
speech. The term ’language centre’ refers to the areas of the
brain which serve a particular function for speech processing
and production. The classical brain-language model derived
from the work of Broca, Wernicke, Lichtheim, Geschwind,
and others has been useful as a heuristic model that stimu-
lates research and as a clinical model that guides diagnosis
[1]. Carl Wernicke created an early neurological model of
language that was later revived by Norman Geschwind; it is
now known as the Wernicke-Geschwind model [2].
*This work was not supported by any organization
¹Debdeep Sikdar is a PhD scholar at the School of Medical Science and Technology, Indian Institute of Technology Kharagpur, India deep@iitkgp.ac.in
²Rinku Roy is a PhD scholar at the Advanced Technology Development Centre, Indian Institute of Technology Kharagpur, India rinku.roy87@gmail.com
³Manjunatha Mahadevappa is Associate Professor at the School of Medical Science and Technology, Indian Institute of Technology Kharagpur, India mmaha2@smst.iitkgp.ernet.in
Ideally, a person suffering from speech-related disabilities
can imagine speech, which produces the brain signals
required for the necessary sequential muscle activation.
This movement imagination can be decoded from the brain
signals and used to reproduce the movement. It is possible
to produce proper signals [3] for speech from imagined
vocal cord movements, and a speech synthesizer can then be
utilised to reproduce the intended speech. Acoustic speech
recognition usually relies on frequency-based features extracted
from the speech signal. Here, a novel approach studying the
nonlinearity of the speech modalities is presented instead.
Multifractal Detrended Fluctuation Analysis (MFDFA)
was utilised to extract features from the EEGs. We
investigated changes in EEG across different modalities of
speech production, namely uttering, whispering, mumbling
and imagined speech. MFDFA was applied to all four modalities
of each dataset to obtain four features, namely
spectrum width, skewness, peak of the spectrum and generalized
Hurst exponent. On the basis of this feature set, classification
accuracies were evaluated for each sub-band as well as for
band-limited EEG separately, using k-Nearest Neighbour
with 10-fold cross-validation. It was found that in each
case the feature set significantly classified the different
modalities of speech with an accuracy of over 95%. The
flowchart of this study is given in Fig. 1.
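Concretely, the feature-extraction step can be sketched as follows. The snippet below is a minimal illustration of MFDFA on a single EEG channel, not the authors' implementation: the scale range, q range, polynomial order and the particular asymmetry (skewness) definition are assumptions, while the four returned values correspond to the spectrum width, skewness, peak of the spectrum and generalized Hurst exponent used in this study.

# Minimal MFDFA sketch for one EEG epoch (assumes numpy; parameter
# choices below are illustrative, not those of the present study).
import numpy as np

def mfdfa_features(x, scales=None, q=None, order=1):
    """Return (spectrum width, skewness, peak alpha, Hurst h(2)) for signal x."""
    x = np.asarray(x, dtype=float)
    if scales is None:
        scales = np.unique(np.logspace(np.log10(16), np.log10(len(x) // 4), 12).astype(int))
    if q is None:
        q = np.linspace(-5, 5, 21)
    q = q[q != 0]                             # q = 0 would need a separate log-average
    y = np.cumsum(x - x.mean())               # profile of the mean-subtracted signal

    Fq = np.zeros((len(q), len(scales)))
    for j, s in enumerate(scales):
        n_seg = len(y) // s
        segs = np.concatenate([y[:n_seg * s].reshape(n_seg, s),
                               y[-n_seg * s:].reshape(n_seg, s)])  # cover both ends
        t = np.arange(s)
        # variance of residuals after local polynomial detrending
        var = np.array([np.mean((seg - np.polyval(np.polyfit(t, seg, order), t)) ** 2)
                        for seg in segs])
        for i, qi in enumerate(q):
            Fq[i, j] = np.mean(var ** (qi / 2.0)) ** (1.0 / qi)

    # generalized Hurst exponent h(q): slope of log F_q(s) versus log s
    h = np.array([np.polyfit(np.log(scales), np.log(Fq[i]), 1)[0] for i in range(len(q))])

    # multifractal spectrum via the Legendre transform of tau(q) = q*h(q) - 1
    tau = q * h - 1
    alpha = np.gradient(tau, q)               # singularity strengths
    f_alpha = q * alpha - tau                 # singularity spectrum f(alpha)

    width = alpha.max() - alpha.min()                         # spectrum width
    peak_alpha = alpha[np.argmax(f_alpha)]                    # alpha at the spectrum peak
    skew = (alpha.max() - peak_alpha) / (peak_alpha - alpha.min() + 1e-12)  # one common asymmetry measure
    h2 = h[np.argmin(np.abs(q - 2))]                          # classical Hurst exponent h(2)
    return width, skew, peak_alpha, h2

The resulting per-channel feature vectors could then, for example, be passed to a k-Nearest Neighbour classifier under 10-fold cross-validation (e.g. scikit-learn's KNeighborsClassifier with cross_val_score) to reproduce the classification step described above.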
II. METHODOLOGY
A. Subjects
Ten non-impaired subjects, aged between 20 and 27 years,
were chosen. The subjects had no history of speech-related
ailments. They were asked to perform different speech
modalities according to visual cues. None of the subjects
were informed about the experimental procedure beforehand,
to minimise bias. Written consent was taken from each
subject before participation.
B. Task
The subjects were seated on an armchair in front of a
table. A screen was placed on the table for visual cues.
Initially, a blank screen with a crosshair at the
centre was shown. The subjects were asked to relax and concentrate on
the instruction that followed. Different vowels (a [/A/], e [/E/], i
[/I/], o [/O/], u [/U/]) were shown randomly on the screen for 1
second. Between two successive vowels, a blank screen was
shown for 5 seconds. The subjects were required to perform
different speech modalities for the vowels shown. In the first
session, the subjects were asked to speak the vowels shown
on the screen loudly. In the next session, the subjects were