Database Reference
In-Depth Information
Recipe 16.4 discusses use of make_dup_count_query() to implement a duplicate-
removal technique.
Summary techniques are useful for assessing the existence of duplicates, how often they
occur, and displaying which values are duplicated. But if duplicates are determined using
only a subset of a table's columns, a summary in itself cannot display the entire content
of the rows that contain the duplicate values. (For example, the summaries shown thus
far display counts of duplicated names in the catalog_list table or the names them‐
selves, but don't show the addresses associated with those names.) To see the original
rows containing the duplicate names, join the summary information to the table from
which it's generated. The following example shows how to do this to display the cata
log_list rows that contain duplicated names. The summary is written to a temporary
table, which then is joined to the catalog_list table to produce the rows that match
those names:
mysql> CREATE TABLE tmp
-> SELECT COUNT(*) AS count, last_name, first_name FROM catalog_list
-> GROUP BY last_name, first_name HAVING count > 1;
mysql> SELECT catalog_list.*
-> FROM tmp INNER JOIN catalog_list USING (last_name, first_name)
-> ORDER BY last_name, first_name;
+-----------+------------+----------------------+
| last_name | first_name | street |
+-----------+------------+----------------------+
| Baxter | Wallace | 57 3rd Ave. |
| BAXTER | WALLACE | 57 3rd Ave. |
| Baxter | Wallace | 57 3rd Ave., Apt 102 |
| Pinter | Marlene | 9 Sunset Trail |
| Pinter | Marlene | 9 Sunset Trail |
+-----------+------------+----------------------+
Duplicate Identification and String Case Sensitivity
For strings that have a case-insensitive collation, values that differ only in lettercase are
considered the same for comparison purposes. To treat them as distinct values, compare
them using a case-sensitive or binary collation. Recipe 5.7 shows how to do this.
16.4. Eliminating Duplicates from a Table
Problem
You want to remove duplicate rows from a table, leaving only unique rows.
 
Search WWH ::




Custom Search