HTML and CSS Reference
foo%'; DELETE * FROM pages; SELECT * FROM pages
WHERE content LIKE '%
This closes the initial search early, adds a second SQL command that deletes all data from the pages table, and
then performs the first part of the next query to make sure the complete set of statements is syntactically
Don't fool yourself into thinking that nobody could possibly figure out how to do this. These attacks happen all
the time, and hackers have gotten very good at working their way into databases in this fashion. Although the
details depend on the database structure and the back-end code, this is not a theoretical attack.
Start with libraries such as Hibernate's Criteria API that clearly separate the data from the code. Use
parameterized queries and stored procedures where possible. When that's not possible, make sure you escape
all significant characters such as ' and " before forming the string.
Many APIs have built-in functions to perform this escaping. For example, PHP provides the
mysql_real_escape_string function for escaping strings in a form suitable for use in MySQL. You could safely
write the preceding code as follows:
$keywords = $_GET['terms'];\
$query = "SELECT url, title FROM pages WHERE content LIKE
'%"$safe_keywords = mysql_real_escape_string($keywords);
. $safe_keywords . "%'";
$result = mysql_query($query);
Many other environments have something similar.
You can limit the possible damage by restricting the access your web server has to the database. Most
databases have functions for assigning different users different levels of privilege. Many queries can be run as a
user who has only SELECT access to the database but not INSERT, UPDATE, or DELETE. This will limit the
damage an attacker can do. However, attackers may still be able to trick the database into revealing
information that was supposed to be secret, even if they can't modify it.
A number of automated security testing tools such as Sprajax and PHP Security Scanner look for such problems.
By all means use them. However, the nature of code means these tools are not perfectly adequate. They can
miss some problems while reporting many false positives. The best solution is to carefully review all code that
accepts input from the user with an eye toward these sorts of security problems. Make sure every point where
input is accepted from outside the program is properly escaped as appropriate for your environment.
I've focused on SQL injection here because it's by far the most common instance of this pattern. However, other
non-SQL systems may be vulnerable as well. Anytime you take data from a user and execute it as code, you're
at risk. This can crop up in XPath, XQuery, XSLT, LDAP, and other systems as well.
Though details vary, the defenses follow the same basic pattern. Carefully inspect all user-supplied data for
reserved characters (whatever those may be in the language of concern) and escape them. If possible, use
vendor-supplied escaping functions instead of hand-rolled ones. Never treat client-supplied data as code.