Figure 9-9
Now the position between a letter, number, or underscore and another letter, number, or underscore is considered a non-word boundary and is replaced by a | in the example. However, what is slightly confusing is that the boundary between two non-word characters, such as an exclamation mark and a comma, is also considered a non-word boundary. If you think about it, it actually does make sense, but it's easy to forget when creating regular expressions.
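To see both cases side by side, here is a minimal sketch that marks every non-word boundary with a |. The string "Hi!, ok" and the variable names are made up for illustration; they are not the ones behind Figure 9-9.

```javascript
// Hypothetical sample string, chosen for illustration only.
var sample = "Hi!, ok";

// \B matches every position that is NOT a word boundary, so each such
// position receives a | when the replacement is applied globally.
var marked = sample.replace(/\B/g, "|");

console.log(marked); // H|i!|,| o|k
```

The | between H and i (and between o and k) sits between two word characters, whereas the | characters on either side of the comma sit between two non-word characters. No | appears between i and !, or between the space and o, because those positions are genuine word boundaries.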
You'll remember this example from when you started looking at regular expressions:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<script language="JavaScript" type="text/JavaScript">
var myString = "Paul, Paula, Pauline, paul, Paul";
var myRegExp = /Paul/gi;
myString = myString.replace(myRegExp, "Ringo");
alert(myString);
</script>
</body>
</html>
You used this code to convert all instances of Paul or paul to Ringo.
However, you found that this code actually converts all instances of Paul to Ringo, even when the word Paul is inside another word.
One way to solve this problem would be to replace the string Paul only where it is followed by a non-word character. The special character for non-word characters is \W, so you need to alter the regular expression to the following:
var myRegExp = /Paul\W/gi;
This gives the result shown in Figure 9-10.
Figure 9-10
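If the figure isn't in front of you, the result can be reproduced with a short sketch that logs the string instead of showing it in an alert box. The variable name result is introduced here for illustration.

```javascript
var myString = "Paul, Paula, Pauline, paul, Paul";
var myRegExp = /Paul\W/gi;

// The character matched by \W is part of the match, so it is replaced too.
var result = myString.replace(myRegExp, "Ringo");

console.log(result); // Ringo Paula, Pauline, Ringo Paul
```

Paula and Pauline are now left alone, but two things are worth noticing. The commas after the first Paul and after paul have disappeared, because \W consumed them as part of each match. The final Paul is unchanged, because it sits at the end of the string and no non-word character follows it.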