HTML and CSS Reference
the keywords x-slow, slow, medium, fast, and x-fast, corresponding to
80, 120, 180, 300, and 500 words per minute, respectively. The faster
keyword sets the rate to 40 words per minute faster than the containing
element, and slower sets the rate to 40 words per minute slower than
the containing element.
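As a brief illustration, the rules below sketch how these rate keywords might be applied. This assumes the property being described is the CSS2 speech-rate property; the class names are hypothetical and chosen only for this example.

```css
/* Assumption: the property under discussion is CSS2's speech-rate. */
/* The class names "aside" and "summary" are hypothetical. */

body       { speech-rate: medium; }  /* about 180 words per minute */
blockquote { speech-rate: slow; }    /* about 120 words per minute */

/* Relative keywords are computed from the containing element's rate. */
p.aside    { speech-rate: faster; }  /* 40 words per minute faster */
p.summary  { speech-rate: slower; }  /* 40 words per minute slower */
```

With these rules, a p.aside paragraph directly inside the body would be spoken at roughly 220 words per minute, because faster adds 40 to the body's 180.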
The voice-family property is the aural analog of the font-family property.
A voice family defines a style and type of speech. Such definitions
are browser and platform specific, much like fonts. It is assumed that
browsers will define generic voice families, such as "male," "female,"
and "child," and may also offer specific voice families like "television
announcer" or "book author." The value of the voice-family property is a
comma-separated list of these voice family names; the browser goes
down the list until it finds a voice family that it can use to speak the
element's content.
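The fallback behavior can be sketched with a pair of rules. The specific voice names here are the illustrative ones from the text, not names any particular browser is known to provide, and the class name is hypothetical.

```css
/* Specific voice names are illustrative; real names depend on the
   browser and platform. The class name "excerpt" is hypothetical. */

/* Try the specific voice first, then fall back to a generic family. */
h1        { voice-family: "television announcer", male; }
p.excerpt { voice-family: "book author", female; }
```

As with font-family, listing a generic family last gives the browser something usable when none of the specific voices is installed.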
The pitch property controls the average pitch, with units in hertz (hz),
of the spoken content. The basic pitch of a voice is defined by the voice
family. Altering the pitch lets you create a variation of the basic voice,
much like changing the point size of a font. For example, with a change
in pitch, the "book author" might be made to sound like a chipmunk. [*]
[*] Assuming, of course, that she doesn't already sound like a chipmunk.
You can set the pitch property to a numeric value such as 120hz or 210hz
(the average pitches of typical male and female voices, respectively) or
to one of the keywords x-low, low, medium, high, or x-high. Unlike other
speech property keywords, these do not correspond to specific pitch
frequencies but instead depend on the base pitch of the voice family. The
only requirement is that these keywords correspond to increasingly lower
or higher pitches.
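A short sketch shows both forms of the pitch value side by side. The class names are hypothetical, and how the keywords actually sound depends on the voice family the browser selects.

```css
/* Class names are hypothetical. */

/* Numeric pitches, in hertz. */
p.narrator { voice-family: male;   pitch: 120hz; }
p.reply    { voice-family: female; pitch: 210hz; }

/* Keyword pitches are relative to the base pitch of the voice family. */
p.ominous  { pitch: x-low; }
p.chipmunk { voice-family: "book author", female; pitch: x-high; }
```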
While the pitch property sets the average pitch, the pitch-range property
defines how far the pitch can change as the browser reproduces
text aurally. The value of this property is a numeric value ranging from 0
to 100, with a default value of 50. Setting the pitch-range to 0 produces
a flat, monotone voice; values over 50 produce increasingly animated
and excited-sounding voices.
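The following rules sketch the two ends of the scale alongside the default. The class names are hypothetical, and the value 85 is an arbitrary pick from the upper half of the range.

```css
/* Class names are hypothetical; 85 is an arbitrary illustrative value. */

p         { pitch-range: 50; }  /* the default: normal inflection */
p.deadpan { pitch-range: 0; }   /* flat, monotone delivery */
p.excited { pitch-range: 85; }  /* highly animated delivery */
```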