Database Reference
In-Depth Information
Notice that the example uses aliases (a, b) in both SELECT statements. The subquery uses an alias (a) for the
table trade.rawtrans, and so its columns are referenced using this alias (a.supplier). The derived table b contains the
table built by the subquery in parentheses. The columns from the derived table are also referenced using an alias
(b.supplier). (This aliasing avoids confusion as to which table, real or derived, a column belongs to.)
You can also use subqueries in the WHERE clause of a SELECT statement (from Hive 0.13), as my example shows
here:
SELECT
DateConv (b.paydate) as paydate,
b.supplier,
b.amount
FROM
trade.rawtrans b
WHERE
b.supplier IN ( SELECT supplier FROM trade.uksupplier );
The data from the trade.rawtrans table with alias b is being filtered against the supplier name. The subquery
in the WHERE clause is checking that the supplier name from trade.rawtrans exists in the UK supplier list table trade.
uksupplier by using the SQL IN clause.
Table Joins
In the real world, you're seldom pulling data from just one table. What happens if you have data in two tables and you
wish to build a SELECT statement using data from both tables? You can use table joins to merge the data from multiple
tables to form a compound data set. Of course, you will need to know which columns exist in each table and that the
same data exists in each table so that the rows can be joined.
For instance, suppose two derived tables (a and b) each contain a column named “supplier”; that means I can
join them on that column. The SELECT statement that follows uses those two derived tables: the first (aliased a) selects
the “department” and “supplier” columns from the trade.rawtrans table. The second derived table (aliased b) selects
the “supplier” and “amount” columns from the same table. Even though they are taking data from the same table, they
are treated as two different derived tables.
The DISTINCT keyword is used in both subqueries to remove duplicates from the data. The derived tables are
then joined using the JOIN keyword. They are joined on the “supplier” key using the ON keyword. Note: only “equal”
joins are accepted. For instance, you could not say ON ( a.amount > b.amount ) .
SELECT
a.dept,
a.supplier,
b.amount
FROM
(
SELECT DISTINCT
c.dept,c.supplier
FROM
trade.rawtrans c
) a
 
Search WWH ::




Custom Search