Solving for the RC4 stream cipher state register using a genetic algorithm

Benjamin Ferriman
School of Computer Science
University of Guelph
Guelph, ON N1L 1L9

Charlie Obimbo
School of Computer Science
University of Guelph
Guelph, ON N1L 1L9

Abstract—The RC4 stream cipher has proven quite resilient to cryptanalysis in the 26 years it has been around. The algorithm is still one of the most widely used methods of encryption over the Internet today, implemented through the Secure Socket Layer and Transport Layer Security protocols. Genetic algorithms are a subclass of evolutionary algorithms that have been used to solve optimization problems in a variety of disciplines. In this paper we examine the abilities of the genetic algorithm as a tool to help solve for the permutation that is stored as the state register of the RC4 stream cipher. Finally, we show that on average the genetic algorithm can solve 100% of the keystream in 2 121.5 generations.

I. INTRODUCTION

Over the past twenty years the Internet has evolved astronomically as a tool for education, pleasure, and economics, to name a few applications. In today’s society there are very few tasks in one’s daily life that are not facilitated by the Internet in some way, shape, or form. As the services available over the Internet continue to expand, new and old problems of security arise and must be accounted for in order to properly support these applications.

One of the largest sectors that continues to grow is on-line banking and other electronic financial transactions (conveniently referred to as E-Commerce). These two applications face many of the same problems as traditional physical banking, but also a new set of challenges that have arisen from the use of the Internet. The most obvious contemporary issue is that of communication.
Traditionally, a customer simply communicated with a bank teller in an environment that could be controlled, as could the manner of communication (i.e., whether something could simply be conveyed through speech or read privately by the customer). With the advent of on-line banking, there is an unknown channel of communication between the customer (client computer) and the teller (bank servers). The very fact that on-line banking improves ease of use by letting customers do their banking virtually anywhere with an Internet connection also hinders their ability to know specifically how their private communication with the bank system is being conducted.

Besides this, there is also the convenient and ubiquitous use of mobile computing. With the advent of smartphones, Internet use is quickly shifting toward the mobile computing environment. According to PewResearch [?], as of May 2013, 63% of adult cell owners use their phones to go online, and 34% of cell Internet users go online mostly using their phones rather than some other device such as a desktop or laptop computer. As can also be seen in Table I, obtained from United States Census Bureau data [?], the younger population, between the ages of 10 and 90, comprises over 60% of the population of the US, and according to PewResearch, as can be seen in Table II, about three-quarters of these have and use smartphones.

TABLE I: Demographics of the US population, 2012

Age          Population (thousands)   Percentage   Cumulative Percentage
All ages     308,827                  100.0%
Under 5      20,110                   6.5%         6.5%
5 - 9        20,416                   6.6%         13.1%
10 - 14      20,605                   6.7%         19.8%
15 - 19      21,239                   6.9%         26.7%
20 - 49      124,607                  40.3%        67.0%
50 - 59      42,842                   13.9%        80.9%
60 - 64      17,501                   5.7%         86.6%
65 & older   41,506                   13.4%        100.0%

TABLE II: Smartphone owners in 2014 [?]

                         Have a smartphone
All Adults               58%
Gender
  a. Men                 61%
  b. Women               57%
Race
  a. White               53%
  b. African American    69%
  c. Hispanics           61%
Age Group
  a. 18 - 29             83%
  b. 30 - 49             74%
  c. 50 - 64             49%
  d. 65+                 19%

With new attacks on Internet-based encryption protocols coming to light in the past four months, a lot of focus has shifted from traditional forms of cryptanalysis to methods of circumvention to attack these ciphers. One cipher that is still widely used and investigated is the RC4 stream cipher. Due to its simplicity and robustness (efficient for both software and hardware) [1], the RC4 stream cipher is one of the most implemented encryption schemes online and over computer networks. Its usage is seen in the Secure Socket Layer (SSL) [2] and Transport Layer Security (TLS) [3]

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 5, No. 5, 2014 216 | Page www.ijacsa.thesai.org