Databases Reference
In-Depth Information
you to remove duplicates. We've added it into this section because, like aggregate
functions, it's concerned with picking examples from the output of a query, rather than
processing individual rows.
An example is the best way to understand
DISTINCT
. Consider this query:
mysql>
SELECT DISTINCT artist_name FROM
-> artist INNER JOIN album USING (artist_id);
+---------------------------+
| artist_name |
+---------------------------+
| New Order |
| Nick Cave & The Bad Seeds |
| Miles Davis |
| The Rolling Stones |
| The Stone Roses |
| Kylie Minogue |
+---------------------------+
6 rows in set (0.03 sec)
The query finds artists who have made albums—by joining together
artist
and
album
with an
INNER JOIN
clause—and reports one example of each artist. You can see
that we have six artists in our database for whom we own albums. If you remove the
DISTINCT
clause, you get one row of output for each album we own:
mysql>
SELECT artist_name FROM
-> artist INNER JOIN album USING (artist_id);
+---------------------------+
| artist_name |
+---------------------------+
| New Order |
| New Order |
| New Order |
| New Order |
| New Order |
| New Order |
| New Order |
| Nick Cave & The Bad Seeds |
| Miles Davis |
| Miles Davis |
| The Rolling Stones |
| The Stone Roses |
| Kylie Minogue |
+---------------------------+
13 rows in set (0.00 sec)
So, the
DISTINCT
clause helps get a summary.
The
DISTINCT
clause applies to the query output and removes rows that have identical
values in the columns selected for output in the query. If you rephrase the previous
query to output both
artist_name
and
album_name
(but otherwise don't change the
JOIN
clause and still use
DISTINCT
), you'll get all 13 rows in the output:
mysql>
SELECT DISTINCT artist_name, album_name FROM
-> artist INNER JOIN album USING (artist_id);