© 2013, IJARCSSE All Rights Reserved Page | 106
Volume 3, Issue 6, June 2013 ISSN: 2277 128X
International Journal of Advanced Research in
Computer Science and Software Engineering
Research Paper
Available online at: www.ijarcsse.com
Understanding Captcha: Text and Audio Based Captcha
with its Applications
Sarika Choudhary, Ritika Saroha: M.Tech (Network Security), School of Engineering & Sciences, BPS Mahila Vishwavidyalaya, Sonepat, Haryana, India
Yatan Dahiya: M.Tech (CSE), Baba Mastnath Engineering College, MDU, Rohtak, Haryana, India
Sachin Choudhary: B.Tech, Modern Institute of Technology and Research Centre, RTU, Kota, Rajasthan, India
Abstract— CAPTCHA is short for Completely Automated Public Turing test to tell Computers and Humans Apart.
The purpose of a CAPTCHA is to block form submissions from spam bots, automated scripts that harvest email
addresses from publicly available web forms. The term "CAPTCHA" was coined in 2000 by Luis von Ahn, Manuel
Blum, and Nicholas J. Hopper (all of Carnegie Mellon University) and John Langford (then of IBM). CAPTCHAs are
used because it is difficult for computers to extract the text from a distorted image, whereas it
is relatively easy for a human to read the text behind the distortions. Therefore, a correct response to
a CAPTCHA challenge is assumed to come from a human, and the user is permitted into the website. The CAPTCHA
test helps identify which users are real human beings and which are computer programs.
Keywords— gimpy, Turing test, CAPTCHA, bongo, reCAPTCHA.
I. INTRODUCTION
You're trying to sign up for a free email service offered by Gmail or Yahoo. Before you can submit your application, you
first have to pass a test. It's not a hard test -- in fact, that's the point. For you, the test should be simple and
straightforward. But for a computer, the test should be almost impossible to solve.
This sort of test is a CAPTCHA, a type of Human Interaction Proof (HIP). You've
probably seen CAPTCHA tests on many Web sites. The most common form of CAPTCHA is an image of several
distorted letters. It's your job to type the correct series of letters into a form. If your letters match the ones in the distorted
image, you pass the test.
CAPTCHA is short for Completely Automated Public Turing test to tell Computers and Humans Apart. The
term "CAPTCHA" was coined in 2000 by Luis von Ahn, Manuel Blum, and Nicholas J. Hopper (all of Carnegie Mellon
University) and John Langford (then of IBM). CAPTCHAs are challenge-response tests that check whether a user is indeed
human. The purpose of a CAPTCHA is to block form submissions from spam bots, automated scripts that harvest email
addresses from publicly available web forms. A common kind of CAPTCHA used on most websites requires users to
enter a string of characters that appears in distorted form on the screen.
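The challenge-response idea can be illustrated with a minimal Python sketch. It is not the method of any particular CAPTCHA system: the names ALPHABET, new_challenge, and verify_response are hypothetical, and the step that renders the string as a distorted image is left out. The sketch only shows the server generating a random string and later checking the user's typed answer against it.

```python
import hmac
import secrets
import string

# Assumed alphabet for this sketch: uppercase letters and digits,
# minus characters that are easy to confuse once distorted (0/O, 1/I/L).
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1IL"
)


def new_challenge(length: int = 6) -> str:
    """Return a random string that the server would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify_response(expected: str, response: str) -> bool:
    """Compare the user's typed answer with the stored challenge, ignoring case."""
    normalized = response.strip().upper()
    # compare_digest performs a constant-time comparison of the two byte strings.
    return hmac.compare_digest(expected.encode(), normalized.encode())
```

In such a design the server would keep the expected string on its side (for example, in the user's session) and send only the distorted image to the browser, so that a bot cannot read the answer from the page itself.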
CAPTCHAs are used because it is difficult for computers to extract the text from a
distorted image, whereas it is relatively easy for a human to read the text behind the distortions. Therefore,
a correct response to a CAPTCHA challenge is assumed to come from a human, and the user is permitted into the
website.
Why would anyone need to create a test that can tell humans and computers apart? It's because of people trying
to game the system -- they want to exploit weaknesses in the computers running the site. While these individuals
probably make up a minority of all the people on the Internet, their actions can affect millions of users and Web sites. For
example, a free e-mail service might find itself bombarded by account requests from an automated program. That
automated program could be part of a larger attempt to send out spam mail to millions of people. The CAPTCHA test
helps identify which users are real human beings and which ones are computer programs.
Spammers are constantly trying to build algorithms that read the distorted text correctly, so CAPTCHAs
must be designed to be strong enough to thwart these efforts.
1.1. Background
The need for CAPTCHAs arose from the abuse of websites and search engines by bots. In 1997, AltaVista sought ways
to block and discourage the automatic submission of URLs to its search engine. Andrei Broder, Chief Scientist of
AltaVista, and his colleagues developed a filter. Their method was to randomly generate printed text that humans
could read but machine readers could not. Their approach was so effective that within a year "spam-add-ons" were reduced by
95%, and a patent was issued in 2001.
In 2000, Yahoo's popular Messenger chat service was hit by bots that posted advertising links, annoying the human
users of its chat rooms. Yahoo, along with Carnegie Mellon University, developed a CAPTCHA called EZ-GIMPY, which