Database Reference
In-Depth Information
• When you create a table, include a primary key or unique index to prevent dupli‐
cates from being added to the table. MySQL uses the index as a constraint to enforce
the requirement that each row in the table contains a unique key in the indexed
column or columns.
• In conjunction with a unique index, the INSERT IGNORE and REPLACE statements
enable you to handle insertion of duplicate rows gracefully without generating er‐
rors. For bulk-loading operations, the same options are available in the form of the
IGNORE or REPLACE modifiers for the LOAD DATA statement.
• To determine whether a table contains duplicates, use GROUP BY to categorize rows
into groups, and COUNT() to see how many rows are in each group. Chapter 8
describes these techniques in the context of producing summaries, but they're useful
for duplicate counting and identification as well. A counting summary groups val‐
ues into categories to determine how frequently each one occurs.
SELECT DISTINCT removes duplicate rows from a result set (see Recipe 3.4 for more
information). For an existing table that already contains duplicates, you can select
unique rows into a second table and use it to replace the original table. Or, if you
determine that there are n identical rows in a table, you can use DELETE LIMIT
to eliminate n -1 instances from that specific set of rows.
Scripts related to the examples shown in this chapter are located in the dups directory
of the recipes distribution. For scripts that create the tables used here, look in the tables
directory.
16.1. Preventing Duplicates from Occurring in a Table
Problem
You want to prevent a table from ever containing duplicates.
Solution
Use a PRIMARY KEY or a UNIQUE index.
Discussion
To ensure that rows in a table are unique, some column or combination of columns
must be required to contain unique values in each row. When this requirement is sat‐
isfied, you can refer to any row in the table unambiguously by using its unique identifier.
To make sure a table has this characteristic, include a PRIMARY KEY or UNIQUE index in
the table structure. The following table contains no such index, so it permits duplicate
rows:
Search WWH ::




Custom Search