HTML and CSS Reference
Hide E-mail Addresses
E-mail addresses published on web pages should be encoded to prevent spambots from harvesting them.
<a href="mailto:elharo@metalab.unc.edu">E-mail Elliotte Harold</a>
elharo@macfaq.com
<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;
elharo%40metalab%2Eunc%2Eedu">E-mail Elliotte Harold</a>
elharo&#x40;macfaq&#x2E;&#x43;om
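The encoded forms above can be produced mechanically. The following Python sketch is one possible way to do it, not a prescribed tool: the function names encode_mailto_href and encode_visible_text are hypothetical, and the sketch assumes a simple address in which only the @ sign and the periods need escaping. It writes the mailto: scheme as decimal character references and percent-escapes the @ and . in the link, then uses hexadecimal character references for the visible text.

```python
import html
from urllib.parse import unquote


def encode_mailto_href(address):
    """Build an obscured href value for a mailto link (hypothetical helper).

    The scheme "mailto:" becomes decimal character references, and the
    "@" and "." in the address become percent escapes.
    """
    scheme = "".join(f"&#{ord(c)};" for c in "mailto:")
    escaped = address.replace("@", "%40").replace(".", "%2E")
    return scheme + escaped


def encode_visible_text(address):
    """Obscure "@" and "." in displayed text with hex character references."""
    return address.replace("@", "&#x40;").replace(".", "&#x2E;")


href = encode_mailto_href("elharo@metalab.unc.edu")
text = encode_visible_text("elharo@macfaq.com")
print(f'<a href="{href}">E-mail Elliotte Harold</a>')
print(text)

# A browser reverses both layers: character references first, then the
# percent escapes. The same round trip can be checked here.
assert unquote(html.unescape(href)) == "mailto:elharo@metalab.unc.edu"
assert html.unescape(text) == "elharo@macfaq.com"
```

The example above additionally encodes one letter of the domain (&#x43; for C), which works because domain names are case-insensitive. Any character may be replaced by a character reference in the same way.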
Motivation
Spammers run spiders that screen-scrape HTML pages for e-mail addresses to spam. However, the spiders
aren't especially smart, don't follow the relevant specifications, and thus can usually be fooled fairly easily.
Potential Trade-offs
Taken to extremes, hiding e-mail addresses from spambots hides them from your legitimate customers and
readers, too. You don't want to do this. Don't go overboard. Make sure your applications allow people to find
you. No solution will be perfect. You cannot block all spammers and let in all humans, but it is far more
important not to block any humans than it is to keep out the last 1% of spam robots.
Mechanics
Finding e-mail addresses is fairly straightforward. This regular expression will pick up most of them:
[\w\-\.\+]+@([\w\-]+\.)+[a-zA-Z]{2,7}
You can also search for mailto: to find mailto links. Indeed, it is the ease of mechanically extracting e-mail
addresses from text that makes spambots so effective. Most spambots don't do anything more sophisticated than
this very search. That's what makes it possible to fool them.
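As a rough illustration of that search, the Python sketch below applies the regular expression given above to two sample strings, one plain and one encoded with character references. The function name find_addresses is hypothetical, and real harvesters may differ. This only models the naive search described here.

```python
import re

# The pattern from the text, compiled unchanged.
EMAIL_RE = re.compile(r"[\w\-\.\+]+@([\w\-]+\.)+[a-zA-Z]{2,7}")


def find_addresses(text):
    """Return every substring of text that the pattern matches."""
    # finditer with group(0) yields the whole match; findall would return
    # only the contents of the parenthesized group.
    return [m.group(0) for m in EMAIL_RE.finditer(text)]


plain = '<a href="mailto:elharo@metalab.unc.edu">E-mail Elliotte Harold</a>'
encoded = "elharo&#x40;macfaq&#x2E;&#x43;om"

print(find_addresses(plain))    # ['elharo@metalab.unc.edu']
print(find_addresses(encoded))  # []
```

The encoded string contains no literal @ character, so a scraper that runs only this search on the raw page source finds nothing, while a browser still decodes the references and displays a normal address.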
The first and most obvious technique is to break the address in a way that's easy for a human to repair but hard
for a robot. For example:
elharo /at/ metalab.unc.edu
or:
elharo@delete.this.part.metalab.unc.edu
The problem is that this keeps the addresses from being copied and pasted without manual editing. Thus, I
prefer not to do this.
Some people embed the e-mail address in an image instead: