HTML and CSS Reference
Prior to HTML5, meta schemes specified a semantic framework that defined the meaning of a key and its value
(the scheme attribute is obsolete in HTML5). They could also prevent potential ambiguity. Listing 7-3 shows an example.
Listing 7-3. A Meta Scheme
<meta name="foo" content="bar" scheme="DC" />
In this case, the meta scheme is Dublin Core (DC).
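To illustrate how a scheme can remove ambiguity, the following sketch follows the pattern of the examples given in the HTML 4.01 specification. The identifier and date values are illustrative only, and because the scheme attribute is obsolete in HTML5, this form applies only to older document types.

```html
<!-- The scheme states that the identifier is to be read as an ISBN -->
<meta name="identifier" content="0-8230-2355-9" scheme="ISBN" />
<!-- Without a scheme, 10-9-97 could mean 9 October or 10 September 1997 -->
<meta name="date" content="10-9-97" scheme="Month-Day-Year" />
```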
The language, keywords, description, and robots meta names contribute to more precise web searches by
defining the document language, the most relevant keywords, and a short description. The last of these,
robots, provides control over search engine behavior to a limited extent. Web pages can be prevented from
being indexed (noindex), cached (noarchive), described by a snippet (nosnippet), or described according to
the Open Directory Project (noodp), and crawlers can be told not to follow the links on a page (nofollow).
The combination noindex, nofollow can be replaced by the single value none. This setting can be used, for
example, for confidential documents whose content and links should not be indexed by search engines
(see footnote 1). Web page descriptions retrieved from the ODP and used by Google, Yahoo!, and Bing can be
disallowed for each search engine specifically. The meta name to apply is Googlebot for Google, Slurp for
Yahoo!, and msnbot for Bing (Listing 7-4).
Listing 7-4. meta Tags for Different Crawlers
<meta name="Googlebot" content="noodp" />
<meta name="Slurp" content="noodp" />
<meta name="msnbot" content="noodp" />
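The none shorthand described earlier can be sketched as follows. Both forms are intended to express the same instruction to crawlers; which one to use is a matter of preference, and the assumption here is that the target search engines recognize the none value.

```html
<!-- Long form: do not index this page and do not follow its links -->
<meta name="robots" content="noindex, nofollow" />
<!-- Short form intended to have the same effect -->
<meta name="robots" content="none" />
```

Only one of the two elements would be used in a real document.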
If you want to prevent the descriptions and titles retrieved from the Yahoo! Directory from being displayed in
search results, you can use the noydir value (Listing 7-5).
Listing 7-5. Using the noydir Attribute Value
<meta name="robots" content="noydir" />
Despite the variety of attribute values, meta tags are not the best way to prevent search engines from indexing or
crawling a site. The robots.txt file should be used for this purpose instead.
The typical general metadata provided in the head section of web documents looks like Listing 7-6.
Listing 7-6. A Complete Example for meta Tags in XHTML5
<meta charset="UTF-8" />
<meta name="robots" content="index, follow" />
<meta name="content-language" content="en" />
<meta name="author" content="John Smith" />
<meta name="keywords" content="My Darling, pet shop, pet accessories, dog, collar,
harness, dog lead, dog kennel, dog bowl, dog coats" />
<meta name="description" content="The website of the pet shop My Darling." />
Since the value of the name attribute on the meta element is robots, the value of the content attribute
(index, follow) applies to all search engines rather than a specific one.
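To contrast the general robots name with a crawler-specific one, the following sketch combines both in the same head section. The noarchive value for Googlebot is an arbitrary choice for illustration, and how a given search engine resolves overlapping directives is up to that engine.

```html
<!-- Applies to every crawler that honors the robots meta name -->
<meta name="robots" content="index, follow" />
<!-- Addressed to Google's crawler only: do not offer a cached copy -->
<meta name="Googlebot" content="noarchive" />
```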
1 There are other techniques to achieve similar results. For example, web documents contained in a directory that is
disallowed in robots.txt will usually be excluded from search results.