Database Reference
In-Depth Information
Solution
Use a counting summary that displays duplicated values. To see the rows in which the
duplicated values occur, join the summary to the original table to display the matching
rows.
Discussion
Suppose that your website has a sign-up page that enables visitors to add themselves to
your mailing list to receive periodic product catalog mailings. But you forgot to include
a unique index in the table when you created it, and now you suspect that some people
are signed up multiple times. Perhaps they forgot they were already on the list, or perhaps
people added friends to the list who were already signed up. Either way, the result of
having duplicate rows is that you mail out duplicate catalogs. This is an additional
expense to you, and it annoys the recipients. This section discusses how to determine
whether there are duplicate rows in a table, how prevalent they are, and how to display
them. (For tables that do contain duplicates, Recipe 16.4 describes how to eliminate
them.)
To determine whether duplicates occur in a table, use a counting summary (a topic
covered in Chapter 8 ). Summary techniques can be applied to identifying and counting
duplicates by grouping rows with GROUP BY and counting the rows in each group using
COUNT() . For the examples here, assume that catalog recipients are listed in a table named
catalog_list that has the following contents:
+-----------+-------------+--------------------------+
| last_name | first_name | street |
+-----------+-------------+--------------------------+
| Isaacson | Jim | 515 Fordam St., Apt. 917 |
| Baxter | Wallace | 57 3rd Ave. |
| McTavish | Taylor | 432 River Run |
| Pinter | Marlene | 9 Sunset Trail |
| BAXTER | WALLACE | 57 3rd Ave. |
| Brown | Bartholomew | 432 River Run |
| Pinter | Marlene | 9 Sunset Trail |
| Baxter | Wallace | 57 3rd Ave., Apt 102 |
+-----------+-------------+--------------------------+
Suppose that you define “duplicate” using the last_name and first_name columns.
That is, recipients with the same name are assumed to be the same person. The following
statements characterize the table and assess the existence and extent of duplicate values:
• The total number of rows in the table:
mysql> SELECT COUNT(*) AS rows FROM catalog_list;
+------+
| rows |
+------+
Search WWH ::




Custom Search