Abstract

A broad range of speech recognition technologies (such as Apple Siri and Amazon Alexa) has been developed as speech-based user interfaces have become ever more convenient and prevalent. Machine learning algorithms are yielding better training results to support these developments in Automatic Speech Recognition (ASR). However, most of these developments have been in languages with worldwide political, economic, and/or scientific influence, such as English, Japanese, German, French, and Spanish, to name a few. By contrast, there has been little or no development of ASR systems (or language technologies) for most minority and under-resourced languages of the world, especially those spoken in Sub-Saharan Africa. One such language is Ngiemboon, which is the focus of this paper. Ngiemboon is a Grassfields Bantu language spoken in the West Region of Cameroon (Africa) by about 400,000 people. This paper highlights the motivations, challenges, and perspectives inherent in a work in progress (speech data collection is underway) to build a Deep Learning based Automatic Speech Recognition System for this under-resourced minority language of Cameroon. It also introduces the issues critical to conducting research in Speech Processing in this language.

1. Introduction

Automatic Speech Recognition is “the process and the related technology for converting the speech signal into its corresponding sequence of words or other linguistic entities by means of algorithms implemented in a device, a computer, or computer clusters” (Li and O'Shaughnessy, 2003). As an active field of research, Automatic Speech Recognition has made significant progress over the past several decades.
“Early attempts to design systems for automatic speech recognition were mostly guided by the theory of acoustic-phonetics, which describes the phonetic elements of speech (the basic sounds of the language) and tries to explain how they are acoustically realized in a spoken utterance” (Juang and Rabiner, 2005). These efforts date back to the early 1950s. Since then, ASR has driven remarkable developments across a broad range of commercial technologies, in which speech recognition as a user interface has become ever more useful and pervasive. However, most of these developments have been in languages with strong scientific, political, and/or economic influence, such as English, German, French, and to some extent Japanese and Spanish, to name a few. Historically, most of these languages have enjoyed social prestige, and their extensive vocabulary has given them prominence in the world of commerce. ASR research and innovation in these languages remain significant and continuous.

By contrast, there has been little or no research and development effort in ASR and other Human Language Technologies for most minority languages of the world, particularly those spoken in Sub-Saharan Africa. Yet these languages serve as the main vector for the socio-economic development of the communities where they are spoken. In this paper, we highlight the motivations, challenges, and perspectives that must be considered in building Human Language Technologies, and more precisely an Automatic Speech Recognition System, for the Ngiemboon language.

1.1 Paper objective and contribution

A surge of interest in the development of technologies for African languages is emerging.
Motivations, Challenges, and Perspectives for the Development of a Deep Learning based Automatic Speech Recognition System for the Under-resourced Ngiemboon Language

Patrice A. Yemmene
School of Engineering
University of Saint Thomas, MN, USA
yemm2299@stthomas.edu

Laurent Besacier
Laboratoire Informatique de Grenoble
University of Grenoble, France
laurent.besacier@univ-grenoble-alpes.fr

1 http://alffa.imag.fr/

The African Languages in the Field: speech Fundamentals and Automation (ALFFA)1 project (spearheaded in France by the “Laboratoire Informatique de Grenoble”