Lip language identification via Wavelet entropy and K-nearest neighbor algorithm
Ran Wang¹, Yifan Cui¹, Xinyu Gao¹, Wei Chen¹, Mingbo Hu¹, Qian Li¹, Jiahui Wei¹, XianWei Jiang¹,*
¹ School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
Abstract
INTRODUCTION: Image processing technology is widely used in lip recognition, which can automatically detect and
analyse the unstable shape of human lips.
OBJECTIVES: In this paper, we propose a new algorithm that uses Wavelet entropy (WE) and K-nearest neighbor (KNN)
to improve the accuracy of lip recognition.
METHODS: At present, the two most commonly used technologies are the wavelet transform and the K-nearest neighbor
algorithm. The wavelet transform provides a set of image descriptors, and the K-nearest neighbor algorithm has high
accuracy. After a large number of experiments, we propose a lip recognition method that combines Wavelet entropy,
K-nearest neighbor and K-fold cross validation.
RESULTS: This method reduces the calculation time and improves the training speed. The best experimental result
reaches an accuracy of 80.08%.
CONCLUSION: Therefore, our algorithm is superior to other state-of-the-art approaches to lip recognition.
Keywords: Lip language identification, Wavelet entropy, K-nearest neighbor, Wavelet transform, K-fold cross validation
Received on 29 June 2021, accepted on 05 August 2021, published on 11 August 2021
Copyright © 2021 Ran Wang et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons
Attribution license, which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly
cited.
doi: 10.4108/eai.11-8-2021.170669
* Corresponding author. Email: jxw@njts.edu.cn
1. Introduction
1.1. What is Lip language identification
Lip speech recognition is a technology that combines
machine vision and natural language processing to identify
speech content directly from images of a person
speaking. A lip recognition system uses machine vision
to continuously detect faces in the image sequence,
determine which person is the speaker, and extract the
continuous changes in that person's mouth shape. These
features are then fed into the lip recognition model,
which identifies the corresponding pronunciation. From
the identified pronunciation, the system computes the
most likely natural language sentence. In lip recognition,
the mappings from mouth shape to pronunciation and from
pronunciation to text are not one-to-one: there are
often several plausible alternatives, so the system must
compute the most likely result in real time.
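As a minimal illustration of this ambiguity (it is not the method proposed in this paper), the following Python sketch ranks the candidate texts for one mouth shape by multiplying the score of each pronunciation given the shape with the score of each text given the pronunciation. The mouth-shape label, the candidate words, all numeric scores, and the function names `rank_candidates` and `most_likely` are hypothetical and chosen only for illustration.

```python
# Hypothetical scores, invented purely for illustration (not from the paper).
# Score of each pronunciation given a mouth shape: one shape, several candidates.
PRONUNCIATION_GIVEN_SHAPE = {
    "closed_lips": {"b": 0.40, "p": 0.35, "m": 0.25},
}

# Score of each text given a pronunciation: again one-to-many.
TEXT_GIVEN_PRONUNCIATION = {
    "b": {"bat": 0.7, "bad": 0.3},
    "p": {"pat": 0.6, "pad": 0.4},
    "m": {"mat": 0.8, "mad": 0.2},
}


def rank_candidates(shape):
    """Return (text, pronunciation, score) tuples, highest score first."""
    candidates = []
    for pron, p_pron in PRONUNCIATION_GIVEN_SHAPE[shape].items():
        for text, p_text in TEXT_GIVEN_PRONUNCIATION[pron].items():
            # Joint score of the path shape -> pronunciation -> text.
            candidates.append((text, pron, p_pron * p_text))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


def most_likely(shape):
    """Pick the single highest-scoring candidate for a mouth shape."""
    return rank_candidates(shape)[0]


if __name__ == "__main__":
    for text, pron, score in rank_candidates("closed_lips"):
        print(f"{text:>4} via /{pron}/: {score:.2f}")
```

A real system would obtain such scores from trained models and evaluate them over sequences of mouth shapes rather than a single shape; the sketch only shows why several alternatives have to be scored and compared.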
1.2. Literature review
In recent years, image processing techniques have been
extensively developed for human lip recognition, which
can automatically detect and analyse the unstable shape of
human lips and distinguish in real time whether the user is
speaking or not. Examples include audiovisual speech
EAI Endorsed Transactions
on e-Learning Research Article
04 2021 - 08 2021 | Volume 7 | Issue 22 | e4