Origin of CAPTCHAs
The first discussion of automated tests that distinguish humans from computers, for
the purpose of controlling access to web services, appears in a 1996 manuscript by Moni Naor
of the Weizmann Institute of Science. Primitive CAPTCHAs seem to have been developed
in 1997 at AltaVista by Andrei Broder and his colleagues to prevent bots from adding
URLs to the search engine. The team sought to make their CAPTCHA resistant to an
Optical Character Recognition (OCR) attack, so they consulted the manual for their Brother
scanner, which included recommendations for improving OCR results.
These recommendations included using similar typefaces and plain backgrounds. The team
created puzzles by attempting to simulate the conditions the manual claimed would cause bad OCR.
In 2000, Luis von Ahn and Manuel Blum developed and publicized the notion of a CAPTCHA, which
included any program that could distinguish humans from computers. They invented multiple
examples of CAPTCHAs, including the first CAPTCHAs to be widely used (at Yahoo!).
Accessibility Concerns
CAPTCHAs are usually based on reading text. This can present a problem for blind or
visually impaired users who would like to access the protected resource. However, CAPTCHAs
do not necessarily have to be visual. Any hard artificial intelligence problem, such as speech
recognition, could be used as the basis of a CAPTCHA. Some implementations of CAPTCHAs
permit visually impaired users to opt for an audio CAPTCHA.
Because CAPTCHAs are designed to be unreadable by machines, common assistive
technology tools, such as screen readers, cannot interpret them. Since sites may use CAPTCHAs
as part of the initial registration process, or even at every login, this challenge can completely
block access for some users. In certain jurisdictions, site owners could become a target for
litigation if they use CAPTCHAs that discriminate against people with disabilities.
Circumvention of CAPTCHAs
Unethical bot writers use a number of means to defeat CAPTCHAs. If a
webmaster has taken the time to insert a CAPTCHA, they surely do not want bots to access
their site. Although this chapter will not demonstrate how to circumvent a CAPTCHA, it will
discuss some of the methods used to do so, so that you are aware of both sides
of this “battle”. Some of the more common means to circumvent CAPTCHAs are listed here:
• Optical Character Recognition (OCR)
• Cheap Human Labor
• Insecure Implementation
Optical Character Recognition is a computer process that converts images of text into
machine-readable text, such as ASCII. This is often used for fax documents. By using OCR,
you can capture the text image of the fax and import it into a word processor for editing.
OCR technology can also be used to circumvent a CAPTCHA. However, most modern CAPTCHAs
take steps to make it very difficult for traditional OCR technology to read them.