Monitoring the security of audio biomedical signals communications in wearable IoT healthcare

Saeid Yazdanpanah a, Saman Shojae Chaeikar b,*, Alireza Jolfaei c

a Department of Computer Engineering, Khorramabad Branch, Islamic Azad University, Khorramabad, Iran
b Australian Institute of Higher Education, Sydney, Australia
c College of Science and Engineering, Flinders University, Adelaide, Australia

ARTICLE INFO

Keywords: Audio security; Audio signal processing; Data hiding; Healthcare data; IoT security

ABSTRACT

The COVID-19 pandemic has imposed new challenges on the healthcare industry, as hospital staff are exposed to a massive coronavirus load when registering new patients, taking temperatures, and providing care. The Ebola epidemic of 2014 is another example, during which a hospital in New York decided to use an audio-based communication system to protect nurses. This idea quickly turned into an Internet of Things (IoT) healthcare solution for communicating with patients remotely. However, it has attracted the attention of criminals, who use this medium as a cover for secret communication. The merging of signal processing and machine-learning techniques has led to the development of steganalyzers with very high efficiency, but since the statistical properties of normal audio files differ from those of purely speech audio files, current steganalysis practices are not efficient enough for this type of content. This research considers the Percent of Equal Adjacent Samples (PEAS) feature for speech steganalysis. This feature efficiently discriminates least significant bit stego speech samples from clean ones with a single analysis dimension. A sensitivity of 99.82% was achieved for the steganalysis of 50% embedded stego instances using a classifier based on the Gaussian membership function.

1. Introduction

Cryptography is the science of making data unintelligible to unauthorized people. It does not, however, conceal from adversaries the existence of the encrypted data or other critical information about it, such as the duration and frequency of communications, the message size, and the identities of the sender and recipient [1,2]. To meet the need for concealed communications, steganography was designed as another security method that can conceal secrets within the body of digital media, enhance privacy, prevent traffic analysis, and allow the transfer of secrets in an imperceptible manner. In response to these applications of steganography, digital forensic scientists have designed a countermeasure, steganalysis, that allows them to supervise potential secret data exchanges that may be performed by hackers, terrorists, and lawbreakers [3,4].

Image, video, and audio are the digital media formats that have been widely used by steganographers as cover media [4,5]. Among these formats, audio files account for a large proportion of today's media transfer and therefore create ample opportunities for illegal data transfer without attracting attention. Nevertheless, digital forensic scientists have paid less attention to audio than to images when developing state-of-the-art steganalysis algorithms, and this domain requires modern steganalyzers that exploit the statistical characteristics of audio formats [6].

As shown in Fig. 1, steganalysis methods can be classified according to the signal characteristics. Multiplicative, phase-encoding, and echo-embedding are the three targeted steganalysis subclasses. The multiplicative embedding process may be described as s[n] = c[n](1 + m[n]), where c[n], m[n], and s[n] are the cover, secret message, and stego signals, respectively [7]. Conventional steganalyzers cannot efficiently detect multiplicative steganography; hence, steganalysis of this class adds the logarithms of the absolute values of the audio samples to the discrimination factor [1].
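The multiplicative model above can be illustrated with a short sketch. It is not the method of [1] or [7]; it only shows one plausible reason the logarithm of the absolute sample values is useful: in the log domain the product becomes a sum, log|s[n]| = log|c[n]| + log|1 + m[n]|, so the message contributes an additive term. The toy cover values, the embedding strength `alpha`, and the bipolar message encoding are assumptions made purely for illustration.

```python
import math
import random


def multiplicative_embed(cover, message):
    """Return stego samples s[n] = c[n] * (1 + m[n])."""
    if len(cover) != len(message):
        raise ValueError("cover and message must have the same length")
    return [c * (1.0 + m) for c, m in zip(cover, message)]


def log_abs(samples, eps=1e-12):
    """Logarithm of absolute sample values; eps guards against log(0)."""
    return [math.log(abs(x) + eps) for x in samples]


# Hypothetical toy data: nonzero cover samples with random sign.
rng = random.Random(0)
cover = [rng.uniform(0.1, 1.0) * rng.choice((-1, 1)) for _ in range(8)]

# Assumed message model: bipolar bits scaled by a small strength alpha.
alpha = 0.01
message = [alpha * rng.choice((-1, 1)) for _ in range(8)]

stego = multiplicative_embed(cover, message)

# In the log domain the message appears as an additive residual:
# log|s[n]| - log|c[n]| = log|1 + m[n]|
residual = [ls - lc for ls, lc in zip(log_abs(stego), log_abs(cover))]
expected = [math.log(abs(1.0 + m)) for m in message]
```

In practice the analyst does not have the cover signal, so the residual cannot be computed directly; the sketch only demonstrates that the log transform turns a multiplicative disturbance into an additive one, which is the form that additive-noise steganalysis features are built to detect.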
Phase coding, the second targeted class, is based on the fact that the relative phases between blocks are preserved, while those among consecutive blocks are changed [8]. The third class, namely echo embedding, maintains a bank of kernels from which one selected kernel is convolved with segments of the cover audio file [9]. Positive-Negative (PN) and Forward-Backward (FB) are two examples of echo-embedding kernels.

* Corresponding author. E-mail address: s.chaeikar@aih.nsw.edu.au (S. Shojae Chaeikar).
Contents lists available at ScienceDirect
Digital Communications and Networks
journal homepage: www.keaipublishing.com/dcan
https://doi.org/10.1016/j.dcan.2022.11.002
Received 22 December 2021; Received in revised form 27 October 2022; Accepted 1 November 2022; Available online 14 November 2022
2352-8648/© 2022 Chongqing University of Posts and Telecommunications. Publishing Services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Digital Communications and Networks 9 (2023) 393–399

The universal class can be classified into calibrated and non-calibrated branches. In the calibrated methods, the challenge is to find features that have been formed mostly based on the embedded message instead of the original signals [1]. The first subclass in calibrated methods