Database Reference
In-Depth Information
%
sqoop import --connect jdbc:mysql://localhost/hadoopguide \
>
--table widgets -m 1 --hive-import
NOTE
Running
sqoop import
with the
--hive-import
argument will load the data directly from the
source database into Hive; it infers a Hive schema automatically based on the schema for the table in the
source database. Using this, you can get started working with your data in Hive with only one command.
Regardless of which data import route we chose, we can now use the
widgets
dataset
and the
sales
dataset together to calculate the most profitable zip code. Let's do so, and
also save the result of this query in another table for later:
hive>
CREATE TABLE zip_profits
>
AS
>
SELECT SUM(w.price * s.qty) AS sales_vol, s.zip FROM SALES s
>
JOIN widgets w ON (s.widget_id = w.id) GROUP BY s.zip;
...
Moving data to: hdfs://localhost/user/hive/warehouse/zip_profits
...
OK
hive>
SELECT * FROM zip_profits ORDER BY sales_vol DESC;
...
OK
403.71 90210
28.0 10005
20.0 95014