HTML and CSS Reference
Meta schemes specify a semantic framework defining the meaning of the key and its value
(prior to HTML5). They can also prevent potential ambiguity. Listing 7-3 shows an example.
Listing 7-3. A Meta Scheme
<meta name="foo" content="bar" scheme="DC" />
In this case, the meta scheme is Dublin Core (DC).
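How a scheme can prevent ambiguity may be easier to see with a date. The following is a minimal sketch that assumes the Dublin Core convention of declaring name prefixes with link elements and naming the value encoding in the scheme attribute. The date value and the choice of the W3CDTF encoding are illustrative assumptions, not taken from Listing 7-3.

```html
<!-- Declare the prefixes used in the meta names (Dublin Core convention) -->
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<!-- The scheme states that the date uses the W3C date and time format (YYYY-MM-DD) -->
<meta name="DC.date" content="2012-04-03" scheme="DCTERMS.W3CDTF" />
```

Without the scheme, a consumer of the metadata would have to guess whether the value means April 3 or March 4.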
The language, keywords, description, and robots meta names contribute to more precise web searches by defining the document language, the most relevant keywords, and a short description. The last of these, robots, provides control over search engine behavior to a limited extent [23]. Web pages can be prevented from being indexed (noindex), having their links followed (nofollow), being cached (noarchive), being described by a snippet (nosnippet), or being described according to the Open Directory Project (noodp) [24]. The combination of the noindex, nofollow values can be substituted by the value none [25]. This setting can be used, for example, for confidential documents whose content and links should not be indexed by search engines.¹
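The equivalence of none and the noindex, nofollow combination described above can be sketched as follows. Both variants are shown side by side only for comparison; a real document would carry just one of them.

```html
<!-- Variant 1: do not index the page and do not follow its links -->
<meta name="robots" content="noindex, nofollow" />
<!-- Variant 2: shorthand for the same combination -->
<meta name="robots" content="none" />
```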
The use of web page descriptions retrieved from ODP can be disallowed separately for Google, Yahoo!, and Bing. The meta name to be applied is Googlebot for Google, Slurp for Yahoo!, and msnbot for Bing (Listing 7-4).
Listing 7-4. meta Tags for Different Crawlers
<meta name="Googlebot" content="noodp" />
<meta name="Slurp" content="noodp" />
<meta name="msnbot" content="noodp" />
If you want to prevent the descriptions and titles retrieved from the Yahoo! Directory from being displayed in search results, you can use the noydir value [26] (Listing 7-5).
Listing 7-5. Using the noydir Attribute Value
<meta name="robots" content="noydir" />
In spite of the variety of attribute values, using meta tags for preventing search engine indexing or crawling is not the best solution. The robots.txt file should be used instead for this purpose.
The typical general metadata provided in the head section of web documents looks like Listing 7-6.
Listing 7-6. A Complete Example for meta Tags in XHTML5
<meta charset="UTF-8" />
<meta name="robots" content="index, follow" />
<meta name="content-language" content="en" />
<meta name="author" content="John Smith" />
<meta name="keywords" content="My Darling, pet shop, pet accessories, dog, collar,
harness, dog lead, dog kennel, dog bowl, dog coats" />
<meta name="description" content="The website of the pet shop My Darling." />
Since the value of the name attribute on the meta element is robots, the value of the content attribute (index, follow) is applied to all search engines rather than a specific one.
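A general robots rule can also be combined with a crawler-specific one. In this minimal sketch, the choice of nosnippet for Googlebot is an illustrative assumption and not part of Listing 7-6; how overlapping rules are resolved is up to each search engine.

```html
<!-- Applies to all crawlers that honor the robots meta name -->
<meta name="robots" content="index, follow" />
<!-- Applies to Google's crawler only: show no text snippet in search results -->
<meta name="Googlebot" content="nosnippet" />
```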
¹ There are other techniques to achieve similar results. For example, web documents contained in a directory that is disallowed in robots.txt will usually be excluded from search results.