ing legitimate traffic. If they do come from a fixed set of sources, simply not responding to
the requests still hogs bandwidth used to receive the attack—and that alone can overload
a network. The attack must be blocked from outside your network, usually by the ISP you
connect to. Most ISPs do not provide this kind of filtering.
The best defense is to simply have more bandwidth than the attacker. This is very difficult
considering that most DDoS attacks involve thousands of machines. Very large companies are
able to use this line of defense. Smaller companies can use DDoS attack mitigation services.
Many CDN vendors (see Section 5.8) provide this service since they have bandwidth available.
Sometimes a DDoS attack does not aim to exhaust bandwidth but rather to consume large
amounts of processing time or load. For example, an attacker might find a small query that
requires a large amount of resources to answer. In this case the banned query list described
previously can be used to block this query until a software release fixes the problem.
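As a rough illustration of the idea (the patterns, function names, and backend call below are
invented for this sketch, not taken from any particular system), a frontend can check each
incoming query against the banned list before doing any expensive work:

    import re

    # Hypothetical patterns for queries known to trigger very expensive processing.
    # In practice the banned-query list would be pushed to every frontend.
    BANNED_QUERY_PATTERNS = [
        re.compile(r"\*{3,}"),        # e.g., wildcard-heavy searches
        re.compile(r"\(.{200,}\)"),   # e.g., very long grouped expressions
    ]

    def run_expensive_search(query):
        # Stand-in for the real (expensive) search backend.
        return "results for " + query

    def is_banned(query):
        """Return True if the query matches any banned pattern."""
        return any(p.search(query) for p in BANNED_QUERY_PATTERNS)

    def handle_search(query):
        # Reject banned queries cheaply, before any backend work is done.
        if is_banned(query):
            return (429, "This query is temporarily disabled.")
        return (200, run_expensive_search(query))

The point of the filter is that the rejection path costs almost nothing, so the expensive
query stops consuming resources the moment the list is updated.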
6.7.3 Scraping Attacks
A scraping attack is an automated process that acts like a web browser to query for
information and then extracts (scrapes) the useful information from the HTML pages it receives.
For example, if you wanted a list of every book ever published but didn't want to pay for
such a database from a library supply company, you could write a program that sends millions
of search requests to Amazon.com, parses the HTML pages, and extracts the book titles
to build your database. This use of Amazon is considered an attack because it violates the
company's terms of service.
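To make the idea concrete, a scraper is little more than a fetch-and-parse loop. The sketch
below uses Python's standard library against a made-up site; the URL, its parameters, and the
title pattern are assumptions for illustration only:

    import re
    import urllib.parse
    import urllib.request

    def scrape_titles(query):
        # Fetch a search-results page (the URL and parameters are hypothetical).
        url = "https://www.example.com/search?q=" + urllib.parse.quote(query)
        with urllib.request.urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        # Pull out whatever markup the site happens to wrap titles in; a real
        # scraper would use an HTML parser and adapt whenever the markup changes.
        return re.findall(r'<span class="title">(.*?)</span>', html)

Run in a loop over millions of queries, a function like this generates traffic that is
indistinguishable in volume from an attack.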
Such an attack must be defended against to prevent theft of information, to prevent
someone from violating the terms of service, and because a very fast scraper is equivalent
to a DoS attack. Detecting such an attack is usually done by having all frontends report
information about the queries they are receiving to a central scraping detector service.
The scraping detector warns the frontends of any suspected attacks. If there is high
confidence that a particular source is involved in an attack, the frontends can block or refuse
to answer the queries. If confidence in the source of the attack is low, the frontends can
respond in other ways. For example, they can ask the user to prove that he or she is a human
by using a Captcha or other system that can distinguish human from machine input.
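A minimal sketch of the frontend side of this arrangement, assuming a detector service that
accepts query reports and returns a suspicion score between 0 and 1 (the interface, the
thresholds, and the responses are invented for illustration):

    import time

    # Invented thresholds; a real deployment would tune these.
    BLOCK_THRESHOLD = 0.9
    CHALLENGE_THRESHOLD = 0.5

    def handle_request(request, detector, search_backend):
        """Frontend logic: report the query, then act on the detector's verdict."""
        # Report metadata about this query to the central scraping detector.
        detector.report(source_ip=request.ip,
                        user_agent=request.user_agent,
                        query=request.query,
                        timestamp=time.time())

        # Ask how suspicious this source currently looks (0.0 to 1.0).
        score = detector.suspicion(request.ip)

        if score >= BLOCK_THRESHOLD:
            # High confidence: refuse to answer.
            return (403, "Refused.")
        if score >= CHALLENGE_THRESHOLD:
            # Lower confidence: challenge with a Captcha before answering.
            return (403, "Please complete the Captcha to continue.")
        return (200, search_backend.answer(request.query))

Keeping the block-or-challenge decision on the frontends means the detector only aggregates
reports and publishes scores; it never sits in the serving path.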
Some scraping is permitted, even desired. A scraping detector should have a whitelist
that allows search engine crawlers and other approved agents to do their job.
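One common way to implement such a whitelist, sketched here as an assumption rather than a
prescription, is to match the claimed user agent and then confirm it with a reverse-DNS lookup,
so that the agent string alone cannot be forged:

    import socket

    # Claimed agent -> expected reverse-DNS suffix (example values).
    CRAWLER_DOMAINS = {
        "Googlebot": ".googlebot.com",
        "bingbot": ".search.msn.com",
    }

    def is_whitelisted_crawler(user_agent, ip):
        """Verify a crawler by reverse DNS plus a forward-confirming lookup."""
        for name, suffix in CRAWLER_DOMAINS.items():
            if name in user_agent:
                try:
                    host, _, _ = socket.gethostbyaddr(ip)
                    # The agent string can be forged; the DNS round trip cannot
                    # easily be, so confirm the hostname maps back to this address.
                    return (host.endswith(suffix)
                            and ip in socket.gethostbyname_ex(host)[2])
                except OSError:
                    return False
        return False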