Database Reference
In-Depth Information
Type: varchar(255)
Collation: latin1_bin
Null: YES
Key:
Default: NULL
Extra:
Privileges: select,insert,update,references
Comment:
This shows information just on the
common_name
column. Notice that the
Collation
is
latin1_bin
. Because of that, regular expressions using
REGEXP
are case sensitive
without having to add the
BINARY
option.
Looking through the
birds
table, we discover some common names for birds that con-
tain the words, “Hawk Owls,” without the hyphen in between. We didn't allow for that in
the expression we gave. We discover also that there are birds in which the word “Hawk”
is not in title case — so we can't count on looking for the uppercase letter,
H
. Our previ-
ous regular expression left those birds out of the results. So we'll have to change the ex-
pression and try a different method. Enter this on your server:
SELECT
common_name
AS
'Hawks'
FROM
birds
WHERE
common_name
REGEXP
'[[:space:]]Hawk|[[.hyphen.]]Hawk'
AND
common_name
NOT REGEXP
'Hawk-Owl|Hawk Owl'
ORDER BY
family_id
;
This first, rather long
REGEXP
expressionuses a
character class
and a
character name
.
The format of character classes and character names is to put the type of character
between two sets of double brackets. A character class is given between a pair of colons
(e.g.,
[[:alpha:]]
for alphabetic characters). A character name is given between two
dots (e.g.,
[[.hyphen.]]
for a hyphen). Looking at the first expression, you can de-
duce that we want rows in which the
common_name
contains either “Hawk” or “-Hawk”
— that is to say,
Hawk
preceded by a space or a hyphen. This won't allow for
Hawk
pre-
ceded by a letter (e.g.,
Nighthawk
). The second expression excludes
Hawk-Owl
and
Hawk
Owl
.
Pattern matching in regular expressions in MySQL tends to be more verbose than they are
in other languages like Perl or PHP. But they do work for basic requirements. For elabor-
ate regular expressions, you'll have to use an API like the Perl DBI to process the data
outside of MySQL. Because that may be a performance hit, it's better to try to accomplish
such tasks withinMySQL using
REGEXP
.