HTML and CSS Reference
Of course, Linklint does not spider the remote pages. It merely checks that they're where they're expected to
be. As you can see, a link can be broken for several reasons. These include:
could not find ip address
The entire host has been removed from the Net and has not been replaced. In some cases, you can find
the replacement host with a little Google work. Otherwise, you should delete the link.
moved
The page has moved to a new location, but the host is still there. If the ultimate response is OK, you can
update the link, but it's not essential and not an immediate problem. If the location is not found, though,
you need to fix the link.
access forbidden
Usually this means the directory has been deleted. You'll need to fix or delete the link.
timed out
The host is still there, but it doesn't seem to be responding at the moment. It may be a temporary glitch,
or the site may be gone for good. Try again tomorrow.
The exact terms vary from one tool to the next, but the reasons and responses are the same.
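To make the categories above concrete, here is a minimal Python sketch of how a checker might sort its results into them. It is not how Linklint works internally. The names classify, check_link, and NoRedirect are hypothetical, and the mapping from status codes to labels is an assumption that other tools may draw differently.

```python
import socket
import urllib.error
import urllib.request


def classify(status=None, error=None):
    """Map an HTTP status code or a network error to a report category."""
    if isinstance(error, socket.gaierror):
        return "could not find ip address"   # DNS lookup failed
    if isinstance(error, (socket.timeout, TimeoutError)):
        return "timed out"
    if status is None:
        return "other error"
    if status in (301, 302, 303, 307, 308):
        return "moved"
    if status == 403:
        return "access forbidden"
    if status == 404:
        return "not found"
    if 200 <= status < 300:
        return "ok"
    return "other error"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 3xx status is reported."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def check_link(url, timeout=10):
    """Send one HEAD request and return the category for the response."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return classify(status=response.status)
    except urllib.error.HTTPError as exc:    # 3xx, 4xx, and 5xx responses
        return classify(status=exc.code)
    except urllib.error.URLError as exc:     # DNS and connection failures
        return classify(error=exc.reason)
    except (socket.timeout, TimeoutError) as exc:
        return classify(error=exc)
```

Like Linklint, check_link does not spider the remote page. It asks only whether the page is where it is expected to be. Some servers reject HEAD requests, so a real checker would probably retry those with GET.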
Note that this process does not just find broken links. It also finds redirected links—that is, links where the
server sends the browser to a new page. These are worth a second look, especially if the server the user is
being redirected to is not the original server. Too often, the browser is being redirected to a spam page that's
been set up at a dead domain. Other times it's the home page of the correct site, but the page you were
actually linking to is missing. You may need to verify these manually.
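The two suspicious cases just described can be flagged automatically before you verify them by hand. The following Python sketch assumes you already have the original URL and the final URL the redirect chain led to. The function name redirect_kind and its labels are hypothetical.

```python
from urllib.parse import urlsplit


def redirect_kind(original_url, final_url):
    """Describe how a redirect target relates to the link's original URL."""
    if original_url == final_url:
        return "not redirected"
    src = urlsplit(original_url)
    dst = urlsplit(final_url)
    # urlsplit lowercases hostnames, so the comparison ignores case.
    # www.example.com and example.com still count as different hosts.
    if src.hostname != dst.hostname:
        return "different host"
    # A deep link that lands on the site root probably lost its page.
    if dst.path in ("", "/") and src.path not in ("", "/"):
        return "home page"
    return "same host"
```

Links reported as "different host" or "home page" are the ones most worth a second look. A "same host" redirect is usually a simple move.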
In many ways, checking for broken links—especially external links—is one of the most annoying refactorings.
Most of the refactorings described in this topic have the advantage of being stable. That is, once you fix a page,
it stays fixed, at least until someone edits it. Not so with link checking. A page can be perfectly fine one minute
and have two dozen broken links the next, and there's not a lot you can do about it. The best you can hope for
is to notice and quickly fix any problems that do arise. For example, set up a cron job that runs Linklint
periodically and e-mails you the results. You can't stop other sites from breaking your links to them, but you can
at least repair the problems as time permits.
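One way to set up the periodic report is a small script that cron runs on whatever schedule you like. The Python sketch below makes several assumptions: the names build_report and run_and_mail are hypothetical, a mail server is assumed to be listening on localhost, and the checker command is passed in by the caller so that no particular Linklint options are presumed.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def build_report(output, sender, recipient):
    """Wrap the link checker's output in an e-mail message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Link check report"
    message.set_content(output or "The link checker produced no output.")
    return message


def run_and_mail(command, sender, recipient, smtp_host="localhost"):
    """Run the checker command and mail whatever it printed."""
    # command is an argument list such as ["linklint", ...]; the options
    # are left to the caller and are not assumed here.
    result = subprocess.run(command, capture_output=True, text=True)
    message = build_report(result.stdout + result.stderr, sender, recipient)
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(message)
```

Separating build_report from run_and_mail lets you check the message format without running the checker or sending any mail.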
Repairing Links
Sometimes you can fix links automatically with search and replace. For example, when Sun's Java site changed
its URL from java.sun.com to www.javasoft.com , it was easy for me to replace all the old links with new ones
just by searching for java.sun.com and changing it to www.javasoft.com . Then when Sun changed the host
name back to java.sun.com a few years later, I just did the search and replace in reverse.
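A host name swap like this one can be scripted. Here is a minimal Python sketch that assumes the pages are UTF-8 files ending in .html under a single directory. The names replace_host and rewrite_tree are hypothetical.

```python
from pathlib import Path


def replace_host(text, old_host, new_host):
    """Return text with every occurrence of old_host replaced by new_host."""
    return text.replace(old_host, new_host)


def rewrite_tree(root, old_host, new_host, pattern="*.html"):
    """Rewrite matching files under root; return the paths that changed."""
    changed = []
    for path in Path(root).rglob(pattern):
        original = path.read_text(encoding="utf-8")
        updated = replace_host(original, old_host, new_host)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling rewrite_tree("site", "java.sun.com", "www.javasoft.com") makes the first change, and swapping the last two arguments reverses it. This is plain substring replacement, so it also changes the host name wherever it appears in visible text, not only in href attributes. Run it on a copy or under version control.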
Most changes aren't this easy. You'll often need to spend some time surfing the targeted site and Googling for
new page locations. Sometimes you'll find them. Sometimes you won't. If you do find a page's new location,
updating the old link to point to it is easy. If you don't, delete the link. Depending on context, you can