HTML and CSS Reference
In-Depth Information
$ aspell --lang=en create master ./webdict < customdict.txt
This creates the file webdict in the current working directory.
Now run the command again with the --add-extra-dicts=./webdict option, like so:
$ cat *.html | aspell --mode=sgml --add-extra-dicts=./webdict list | sort | uniq
This time, Aspell will generate a list of words that are far more likely to be genuine misspellings. When I
recently checked my site, my initial list comprised more than 11,000 flagged words. After scanning them and
creating a custom dictionary, the potential spelling errors were reduced to 1,138, a much more manageable
number.
At this point, I would take each word in this new shorter list and search for it using this regular expression:
\bmisspelling\b
The \b on both ends limits the search to word boundaries so that I don't accidentally find it in the middle of
other words. For example, if I'm correcting adn to and, I don't want to also change sadness to sandess.
If I'm uncertain about whether a word is correctly spelled, I may open the file and fix it manually. If I know it's
obviously wrong, I'll just replace it.
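As a rough illustration of the word-boundary search and replace described above, here is a minimal sketch. It assumes GNU grep and GNU sed, both of which understand \b and, for sed, the -i option; other implementations may not. The helper name fix_word and the sample file are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical helper: replace a whole-word misspelling in one file.
# Assumes GNU sed, where \b matches a word boundary and -i edits in place.
fix_word() {
    wrong=$1
    right=$2
    file=$3
    sed -i "s/\b${wrong}\b/${right}/g" "$file"
}

# Demo on a throwaway file with one real error and one innocent word.
demo=$(mktemp)
printf 'salt adn pepper\nsadness\n' > "$demo"

# Find the misspelling first; only the line with the standalone word matches.
grep -n '\badn\b' "$demo"

# Replace it, then inspect the result; "sadness" is left alone.
fix_word adn and "$demo"
cat "$demo"
```

For a word I'm less sure about, I would run only the grep step and then open the files it lists by hand.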
The alternative to this approach, and one that may be more accessible to some people, is to use a traditional
GUI spell checker, such as those built into BBEdit and Dreamweaver, and check files one at a time. That's
certainly possible, but in my experience it takes quite a bit longer, and the larger the site, the longer it takes.
Spelling errors are not independent. The same ones tend to crop up again and again.
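Because the same errors recur, it can help to see which flagged words appear most often before deciding what to fix first. The following is only a sketch: the helper name rank_words is hypothetical, and the demo input is made up to stand in for Aspell's output.

```shell
#!/bin/sh
# Hypothetical helper: read one word per line on stdin and
# print "count word", with the most frequent words first.
rank_words() {
    sort | uniq -c | sort -k1,1nr -k2 | awk '{ print $1, $2 }'
}

# Demo with made-up words standing in for Aspell's list.
printf 'teh\nadn\nteh\nrecieve\nteh\nadn\n' | rank_words

# Against a real site, the same helper would sit at the end of the
# pipeline shown earlier, in place of "sort | uniq":
#   cat *.html | aspell --mode=sgml --add-extra-dicts=./webdict list | rank_words
```

Fixing the words at the top of this list first removes the largest number of errors per replacement.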
A nice compromise position is to use Aspell to build up a custom dictionary and then use that custom dictionary
as you check individual files, whether with Aspell or with some other GUI tools. Most decent tools should be able
to import a custom dictionary saved as a plain text file.
For many sites, that may be enough. Anything this process catches is certainly something you'll want to fix.
However, professional sites that convey your image to the world are worth a little more effort. Some things a
machine can't catch. For instance, I know of one site where a spell checker did not notice an omitted l in the
word public, since what remains is still a valid word, with consequently embarrassing results. If at all possible, I recommend hiring a professional
proofreader to catch these and similar mistakes, as well as errors of grammar, meaning, and style that a
computer program just won't recognize.
Although I usually say that hiring a professional proofreader is optional, there is one exception. If you are
publishing a site in anything other than your primary native language, professional native assistance is
mandatory. Even well-educated, truly bilingual people rarely have formal education in more than one tongue.
Any commercial site publishing in a non-native language should insist on a native proofreader.
However you go about this, correcting spelling errors can take a while. The effort involved is roughly linear in the
size of the site after you get your initial custom dictionary set up. As usual, you don't have to do it all at
once. Start with your home page and other frequently accessed pages, as indicated by your server logs. Then
work your way forward from there. Every error you correct is one
less error for site visitors to notice and to think less of you for.
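One way to find the frequently accessed pages is to count requests per path in the server's access log. This sketch assumes the log is in Common Log Format, where the request path is the seventh whitespace-separated field; other formats would need a different field number. The helper name top_pages and the sample log entries are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the N most requested paths from an access log.
# Assumes Common Log Format, so the request path is field 7.
top_pages() {
    logfile=$1
    count=${2:-10}
    awk '{ print $7 }' "$logfile" | sort | uniq -c | sort -k1,1nr -k2 |
        head -n "$count" | awk '{ print $2 }'
}

# Demo with a made-up three-line log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1120
192.0.2.3 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326
EOF

# Most requested pages first: spell check these before the rest.
top_pages "$sample_log" 2
```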