HTML and CSS Reference
Greedy and Nongreedy Matches
By default, all quantifiers are greedy. That is, they match as much text as they can while still allowing the
overall pattern to succeed. For example, suppose you have the following paragraph:
<q id='g1'>Take your seats,</q> said the guard.<q id='g2'>Going
by the train, sir?</q>
Now suppose you want to match all the q start-tags, and consequently you use the regular expression <q.*>. In
fact, this will find:
<q id='g1'>Take your seats,</q> said the guard.<q id='g2'>
The regular expression <q.*> matches everything from the first <q on the line to the last >. The only reason it
stops there is that the period does not match line breaks. The match is said to be greedy.
You specify a nongreedy match, one that stops at the first opportunity, by putting a question mark after the
quantifier. Thus, if I had written the regular expression as <q.*?>, it would have stopped with the
start-tag <q id='g1'>.
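The difference can be checked with a short script. This is only a sketch, and it assumes Python's re module, whose greedy and nongreedy quantifiers follow the conventions described here. The variable names are illustrative, and other regular expression engines may differ in detail.

```python
import re

# The sample paragraph from the text, with the line break after "Going".
text = ("<q id='g1'>Take your seats,</q> said the guard.<q id='g2'>Going\n"
        "by the train, sir?</q>")

# Greedy: .* runs to the end of the line, then backs up to the last > on it.
greedy = re.search(r"<q.*>", text).group()

# Nongreedy: .*? stops at the first > it can, so each start-tag matches separately.
lazy = re.findall(r"<q.*?>", text)

print(greedy)  # <q id='g1'>Take your seats,</q> said the guard.<q id='g2'>
print(lazy)    # ["<q id='g1'>", "<q id='g2'>"]
```

The greedy search returns one long match spanning both start-tags, whereas the nongreedy pattern finds the two start-tags individually.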
You can also use nongreedy matches with the other quantifiers, such as ? and +. For example, a+? will match at
least one a, but then it will stop if it can. However, if this is part of a larger pattern, such as a*?b or a+?b, it will
match as many a's as it needs to get to the first b.
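A small experiment shows how a nongreedy quantifier behaves alone and inside a larger pattern. Again this is a sketch that assumes Python's re module. The input string "aaab" and the variable names are made up for illustration.

```python
import re

subject = "aaab"

# Alone, a+? is satisfied by a single a, and a*? by no characters at all.
plus_alone = re.match(r"a+?", subject).group()
star_alone = re.match(r"a*?", subject).group()

# Followed by b, the nongreedy quantifier must expand until it reaches the b.
plus_then_b = re.match(r"a+?b", subject).group()
star_then_b = re.match(r"a*?b", subject).group()

print(repr(plus_alone))   # 'a'
print(repr(star_alone))   # ''
print(repr(plus_then_b))  # 'aaab'
print(repr(star_then_b))  # 'aaab'
```

A nongreedy quantifier prefers the shortest repetition, but it still takes as many characters as the rest of the pattern needs in order to match.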