Handling Duplicates - MySQL

Database Reference

In-Depth Information

| 8 |

+------+

• The number of distinct names:

mysql> SELECT COUNT(DISTINCT last_name, first_name) AS 'distinct names'

-> FROM catalog_list;

+----------------+

| distinct names |

+----------------+

| 5 |

+----------------+

• The number of rows containing duplicated names:

mysql> SELECT COUNT(*) - COUNT(DISTINCT last_name, first_name)

-> AS 'duplicate names'

-> FROM catalog_list;

+-----------------+

| duplicate names |

+-----------------+

| 3 |

+-----------------+

• The fraction of the rows that contain unique or nonunique names:

mysql> SELECT COUNT(DISTINCT last_name, first_name) / COUNT(*)

-> AS 'unique',

-> 1 - (COUNT(DISTINCT last_name, first_name) / COUNT(*))

-> AS 'nonunique'

-> FROM catalog_list;

+--------+-----------+

| unique | nonunique |

+--------+-----------+

| 0.6250 | 0.3750 |

+--------+-----------+

Those statements help you characterize the extent of duplicates, but they don't show

you which values are duplicated. To see the duplicated names in the catalog_list table,

use a summary statement that displays the nonunique values along with the counts:

mysql> SELECT COUNT(*), last_name, first_name

-> FROM catalog_list

-> GROUP BY last_name, first_name

-> HAVING COUNT(*) > 1;

+----------+-----------+------------+

| COUNT(*) | last_name | first_name |

+----------+-----------+------------+

| 3 | Baxter | Wallace |

| 2 | Pinter | Marlene |

+----------+-----------+------------+

MySQL

Search WWH ::

Custom Search

Home