If you look back at the special repetition characters table, you'll see that they apply to the item
preceding them. This can be a character, or, where they have been grouped by means of parentheses,
the previous group of characters.
However, there is a potential problem with the regular expression you just defined. As well as matching
VBScript and JavaScript, it also matches VBJavaScript. This is clearly not what you meant.
To get around this, you need to make use of both grouping and the special character |, which is the
alternation character. It has an or-like meaning, similar to || in if statements, and will match the
characters on either side of itself.
Let's think about the problem again. You want the pattern to match VBScript or JavaScript. Clearly they
have the Script part in common. So what you want is a new word starting with Java or starting with VB;
either way, it must end in Script.
First, you know that the word must start with a word boundary:
\b
Next, you know that you want either VB or Java to be at the start of the word. You've just seen that in
regular expressions | provides the “or” you need, so in regular expression syntax you want the following:
\b(VB|Java)
This matches the pattern VB or Java. Now you can just add the Script part, followed by a closing word
boundary:
\b(VB|Java)Script\b
Your final code looks like this:
var myString = "JavaScript, VBScript and Perl";
var myRegExp = /\b(VB|Java)Script\b/gi;
myString = myString.replace(myRegExp, "xxxx");
alert(myString);
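To check how the pattern behaves outside the browser, the snippet below wraps the same replacement in a function and runs it against two strings. It is only a sketch: the function name maskScriptNames and the second sample input are made up for illustration and are not part of the original example.

```javascript
// Hypothetical helper (not from the original example): replaces every
// whole-word occurrence of VBScript or JavaScript with "xxxx".
function maskScriptNames(text) {
  // (VB|Java) must match exactly one of the two prefixes before "Script".
  var pattern = /\b(VB|Java)Script\b/gi;
  return text.replace(pattern, "xxxx");
}

// Both script names are replaced; Perl is left alone.
console.log(maskScriptNames("JavaScript, VBScript and Perl"));

// VBJavaScript no longer matches, so the string comes back unchanged.
console.log(maskScriptNames("VBJavaScript"));
```

Because the group requires exactly one of VB or Java between the opening word boundary and Script, the combined word VBJavaScript is rejected, which is the behavior the alternation was introduced to get.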
Reusing Groups of Characters
You can reuse the characters matched by a group later on in the regular expression. To refer to a
previous group of characters, you just type \ and a number indicating the order of the group. For
example, you can refer to the first group as \1, the second as \2, and so on.
Let's look at an example. Say you have a list of numbers in a string, with each number separated by
a comma. For whatever reason, you are not allowed to have two instances of the same number in a
row, so although 009,007,001,002,004,003 would be okay, the following:
007,007,001,002,002,003
would not be valid, because you have 007 and 002 repeated after themselves.
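One way to detect such repeats is with a back-reference to the first group. The sketch below is an illustration rather than a definitive solution: the function name findRepeatedNumbers and the exact pattern, including its word boundaries, are assumptions chosen for this example.

```javascript
// Hypothetical helper: returns every "n,n" pair in which a number is
// immediately followed by the same number.
function findRepeatedNumbers(list) {
  // (\d+) captures a number; \1 then requires the same digits again
  // right after the comma. The \b anchors keep the match on whole numbers.
  var pattern = /\b(\d+),\1\b/g;
  // With the g flag, match() returns all matches, or null if there are none.
  return list.match(pattern) || [];
}

// No number is followed by itself, so the result is an empty array.
console.log(findRepeatedNumbers("009,007,001,002,004,003"));

// Two repeats are found: 007,007 and 002,002.
console.log(findRepeatedNumbers("007,007,001,002,002,003"));
```

An empty result means the list is valid under the rule described above; any entries in the result identify the offending pairs.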