HTML and CSS Reference
The spec says that <bdi> “represents a span of text that is to be
isolated from its surroundings for the purposes of bidirectional
text formatting.” Huh? I asked Richard Ishida (@r12a on Twitter—
follow him!), the W3C's internationalization lead, to explain this
to me, and he was kind enough to write a blog post in response,
which is reproduced here with his permission.
The HTML5 specification contains a bunch of new features to
support bidirectional text in web pages. Languages written with
right-to-left scripts (such as Arabic, Hebrew, Persian, Thaana,
and Urdu) commonly mix in words or phrases in English
or some other language that uses a left-to-right script. The result
is called bidirectional or bidi text.
HTML 4.01, coupled with the Unicode Bidirectional Algorithm,
already does a pretty good job of managing bidirectional text,
but there are still some problems when dealing with embedded
text from user input or from stored data.
Here's an example where the names of restaurants are added
to a page from a database. This is the code, with the Hebrew
shown using ASCII:
<p>Aroma - 3 reviews</p>
<p>PURPLE PIZZA - 5 reviews</p>
Figures 2.19 and 2.20 show what you'd expect to see, and what
you'd actually see, respectively.
The problem arises because the browser thinks that the “- 5” is
part of the Hebrew text. This is what the Unicode Bidi Algorithm
tells it to do, and usually it is correct. Not here though.
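To see why the algorithm makes this choice, it can help to look at the bidi class Unicode assigns to each character. The following is a minimal sketch using Python's standard unicodedata module. The Hebrew word is a stand-in chosen for illustration (it is not the name from the example above), and bidi_classes is a hypothetical helper name.

```python
import unicodedata

# Stand-in restaurant name in real Hebrew letters (hypothetical sample data),
# followed by the same " - 5 reviews" suffix used in the example above.
line = "\u05e4\u05d9\u05e6\u05d4 - 5 reviews"


def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


for ch, cls in bidi_classes(line):
    # Print code points rather than glyphs so the output order is unambiguous.
    print(f"U+{ord(ch):04X} {cls}")
```

The Hebrew letters come out as strong right-to-left (R) and the Latin letters as strong left-to-right (L). The spaces, the hyphen, and the digit have no strong direction of their own, so the algorithm has to resolve them from the neighboring text, and here that is the Hebrew that precedes them.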
So how do we fix it? The trick is to put the <bdi> element
around the text to isolate it from its surrounding content.
(bdi stands for “bidi-isolate.”)
<p><bdi>Aroma</bdi> - 3 reviews</p>
<p><bdi>PURPLE PIZZA</bdi> - 5 reviews</p>
The bidi algorithm now treats the Hebrew and “- 5” as separate
chunks of content, and orders those chunks according to the
direction of the overall context (in this instance, from left to right).
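Because the names come from a database, the wrapping would normally happen wherever the page is generated. Here is a minimal sketch of that step in Python, assuming the rows arrive as (name, count) pairs. The review_line function and the sample rows are hypothetical, and the second name is a stand-in Hebrew word rather than the one in the example.

```python
from html import escape


def review_line(name, count):
    """Build one <p> line, isolating the stored name inside <bdi>."""
    # escape() guards against markup characters in the stored name;
    # <bdi> keeps the name's direction from affecting the text after it.
    return f"<p><bdi>{escape(name)}</bdi> - {count} reviews</p>"


# Hypothetical rows standing in for a database query result.
rows = [("Aroma", 3), ("\u05e4\u05d9\u05e6\u05d4", 5)]
for name, count in rows:
    print(review_line(name, count))
```

Wrapping every name, whatever its script, means the template never needs to know in advance which entries are written right to left.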
Figure 2.19 How we'd like our web page to look.
Figure 2.20 How our bidi page actually looks. Note the numeral “5”
has been separated from the word “reviews.” The content is now