Journal of Signal Processing Systems
https://doi.org/10.1007/s11265-018-1375-6
Hand Sign Recognition for Thai Finger Spelling: an Application
of Convolution Neural Network
Pisit Nakjai¹ · Tatpong Katanyukul¹
Received: 11 September 2017 / Revised: 11 January 2018 / Accepted: 26 April 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018
Abstract
Finger spelling is a necessary part of sign language, an important means of communication among people with
hearing disabilities. It is used to spell out names, places, or signs that have not yet been defined. A
sign recognition system aims to improve communication between the hearing majority and people with hearing disabilities.
Our study investigates Thai Finger Spelling (TFS), its unique characteristics, the design of automatic TFS recognition, and
approaches to handling a key potential issue in TFS recognition. We design automatic TFS recognition as a two-stage pipeline: (1)
locating and extracting the signing hand in the image and (2) classifying the signing image as a valid TFS sign. The signing
hand is located and extracted based on a color scheme and the contour area computed using Green’s Theorem. Two approaches are examined
for signing-image classification: a Convolution Neural Network (CNN)-based approach and a Histogram of Oriented Gradients (HOG)-based
approach. Our experimental results show the viability of the proposed pipeline, which achieves a mean Average
Precision (mAP) of 91.26. The proposed design outperforms the state of the art in automatic visual TFS recognition. In a
practical sign recognition system, invalid TFS signs may appear during sign transitions or simply from unintended hand postures.
To address this, we propose a formulation called the confidence ratio, which is simple to compute and generally compatible with
multi-class classifiers. The confidence ratio is found to be a promising mechanism for identifying invalid TFS signs.
Our findings reveal challenging issues in TFS recognition, a practical design for TFS sign transcription, and the formulation
and effectiveness of the confidence ratio.
Keywords Sign language transcription · Thai sign recognition · Thai Finger Spelling · Convolution Neural Network ·
Open-set recognition
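The contour-area step mentioned in the abstract can be illustrated with a small sketch. For a closed polygonal contour, Green’s Theorem reduces the enclosed area to a sum over consecutive vertices (the shoelace formula). The function names `contour_area` and `largest_contour`, and the choice of keeping the largest candidate region, are illustrative assumptions; the text here does not specify how the authors implement this step.

```python
def contour_area(points):
    """Area enclosed by a closed polygonal contour.

    Green's theorem gives A = 1/2 * |sum_i (x_i * y_{i+1} - x_{i+1} * y_i)|,
    where the vertex after the last one wraps around to the first.
    `points` is a sequence of (x, y) pairs in traversal order.
    """
    n = len(points)
    if n < 3:
        return 0.0  # fewer than three vertices enclose no area
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def largest_contour(contours):
    """Return the contour with the largest enclosed area.

    Assumption for illustration only: the hand is taken to be the
    largest candidate region that survives color-based filtering.
    """
    return max(contours, key=contour_area)


if __name__ == "__main__":
    rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
    triangle = [(0, 0), (2, 0), (0, 2)]
    print(contour_area(rectangle))
    print(contour_area(triangle))
    print(largest_contour([triangle, rectangle]) is rectangle)
```

The absolute value makes the result independent of whether the contour is traversed clockwise or counter-clockwise.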
1 Introduction
Face-to-face communication is essential, both as a
communication channel and as a source of psychological
connection through verbal communication. Because of their
hearing difficulty, deaf people rely on writing and sign
language. Writing is less personal and slower than
face-to-face communication. Deaf people have been reported
to develop writing skills more slowly than the hearing
majority [3].
Pisit Nakjai
mynameisbee@gmail.com
Tatpong Katanyukul
tatpong@gmail.com
¹ Department of Computer Engineering, Khon Kaen University,
Khon Kaen, Thailand
In addition, widely used video-telephony services,
e.g., FaceTime, affect how deaf people communicate. Deaf
people can use videophones to communicate with other deaf
people, but communication between deaf people and the
hearing majority remains difficult. Interpretation is needed
to translate sign language into text for better communication.
As a convenient and more personal alternative, sign
language plays a crucial part in the deaf community. However,
sign language is not universal. There are many sign
languages, e.g., American Sign Language (ASL), British Sign
Language (BSL), Chinese Sign Language (CSL), Japanese
Sign Language (JSL), and Thai Sign Language (TSL).
A sign language usually has two schemes: a sign scheme
and a finger spelling scheme. The sign scheme is the use
of hand gestures, facial expressions, and actions to convey
meaning, attitude, and sentiment. The finger spelling scheme
is the use of hand gestures to represent the letters of the
alphabet of the corresponding language. Finger spelling can be used