Databases Reference
In-Depth Information
+---------------------------+------------------------------------------+
| artist_name | album_name |
+---------------------------+------------------------------------------+
| New Order | Retro - John McCready FAN |
| New Order | Substance (Disc 2) |
| New Order | Retro - Miranda Sawyer POP |
| New Order | Retro - New Order / Bobby Gillespie LIVE |
| New Order | Power, Corruption & Lies |
| New Order | Substance 1987 (Disc 1) |
| New Order | Brotherhood |
| Nick Cave & The Bad Seeds | Let Love In |
| Miles Davis | Live Around The World |
| Miles Davis | In A Silent Way |
| The Rolling Stones | Exile On Main Street |
| The Stone Roses | Second Coming |
| Kylie Minogue | Light Years |
+---------------------------+------------------------------------------+
13 rows in set (0.00 sec)
Because none of the rows are identical, no duplicates are removed using DISTINCT . You
can check this by rephrasing the query to omit the DISTINCT clause; you'll get the same
output.
To remove duplicates, MySQL needs to sort the output. If indexes are available that
are in the same order as required for the sort—or the data itself is in an order that's
useful—this process has very little overhead. However, for large tables and without an
easy way of accessing the data in the right order, sorting can be very slow. You should
use DISTINCT (and other aggregate functions) with caution on large data sets. If you do
use it, you can check its behavior using the EXPLAIN statement discussed in Chapter 8.
The GROUP BY Clause
The GROUP BY clause sorts data into groups for the purpose of aggregation. It's similar
to ORDER BY , but it occurs much earlier in the query process: GROUP BY is used to organize
the data before other clauses—such as WHERE , ORDER BY , and functions—are applied. In
contrast, ORDER BY is applied last—after the query has been resolved—to reorganize the
query output for display.
An example will help you understand what GROUP BY is used for. Suppose you want to
know how many albums we own by each artist. Using the techniques you've learned
so far, you could perform an INNER JOIN between artist and album , and use an ORDER
BY artist_name clause to organize the artists into an order to make it easy for you to
count. Here's the query that you'd use:
mysql> SELECT artist_name FROM
-> artist INNER JOIN album USING (artist_id)
-> ORDER BY artist_name;
+---------------------------+
| artist_name |
+---------------------------+
| Kylie Minogue |
 
Search WWH ::




Custom Search