This is a draft version. The final version is available at http://link.springer.com/chapter/10.1007/978-3-642-34109-0_25

Basic Word Completion and Prediction for Hebrew

Yaakov HaCohen-Kerner 1, Izek Greenfield 2

1 Dept. of Computer Science, Jerusalem College of Technology, 91160 Jerusalem, Israel
kerner@jct.ac.il
2 CISCO ltd., 5 Shlomo Halevi St., Har Hotzvim, 97770 Jerusalem, Israel
izekgri@gmail.com

Abstract. This research aims to improve keystroke savings for completion and prediction of Hebrew words. This task is important for augmentative and alternative communication systems as well as for search engines, short message services, and mobile phones. The proposed model is composed of Hebrew corpora containing 177M words, a morphological analyzer, various n-gram Hebrew language models, and other tools. The achieved keystroke savings rate is higher than those reported for a previous Hebrew word prediction system and for previous word prediction systems in other languages. There are two main findings: the larger the corpus on which the language model is trained, the better the predictions; and a morphological analyzer helps only when the language model is based on a single corpus.

Keywords: Augmentative and alternative communication, Corpora, Hebrew, Keystroke savings, Language models, Word completion, Word prediction.

1 Introduction

Word prediction is the suggestion of relevant words in response to a user's keystrokes. It is mainly used in systems that help people with physical disabilities to increase their typing speed [1] and to decrease the number of keystrokes needed to complete a word [2]. The main aims of word prediction are to speed up typing and to reduce writing errors (especially for people with dyslexia). Word completion and prediction are also very common in search engines and short message services, and on mobile phones and other hand-held devices with limited keyboards.
After a user types the beginning of a word, the system usually offers a list of relevant words or, in some cases, automatically completes the word. The main evaluation measure for word prediction is keystroke savings (KS) [3, 4, 56]. KS measures the percentage of key presses saved compared with letter-by-letter text entry. It is computed using the following formula: KS = (chars − keystrokes) / chars × 100, where chars is the number of characters in the text, including spaces and newlines, and keystrokes is the minimum number of key presses required to enter the