automobiles.txt /user/cloudera/automobiles/automobiles.txt
[cloudera@localhost ~]$ hdfs dfs -moveFromLocal motorcycles.txt /user/cloudera/motorcycles/motorcycles.txt
[cloudera@localhost ~]$ hdfs dfs -ls /user/cloudera/motorcycles/
Found 1 items
-rw-r--r-- 3 cloudera cloudera 932 2013-10-15 19:19 /user/cloudera/motorcycles/motorcycles.txt
[cloudera@localhost ~]$ hdfs dfs -ls /user/cloudera/automobiles/
Found 1 items
-rw-r--r-- 3 cloudera cloudera 985 2013-10-15 19:17 /user/cloudera/automobiles/automobiles.txt
Now we will load the preceding data into two separate tables, using a different
method for each so that we see more than one way of loading data. The tables used
here are external tables rather than internal ones. For the automobile data, I will
load it directly from a script into the automobiles table; I will then load the
motorcycle data into the motorcycles table from inside the Impala shell. The script
also creates another, empty table, automakers. Later, we will join a list of
automakers from both tables. All of this processing will be done in a database
named autos.
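To make the distinction between external and internal tables concrete, here is a minimal sketch of what an external table definition over one of the HDFS directories above might look like. The table name motorcycles_sketch, the two columns, and the comma delimiter are assumptions made for illustration only; they are not taken from the dataset or from the script in the next section.

```sql
-- Hypothetical sketch: table name, columns, and delimiter are assumed.
-- EXTERNAL tells Impala to read the files already in the directory
-- rather than take ownership of them.
CREATE EXTERNAL TABLE IF NOT EXISTS motorcycles_sketch (
  make  STRING,   -- assumed column
  model STRING    -- assumed column
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','   -- assumed delimiter
LOCATION '/user/cloudera/motorcycles/';

-- Dropping an external table removes only the table definition;
-- the files under LOCATION stay in HDFS.
DROP TABLE IF EXISTS motorcycles_sketch;
```

With an internal table, the same DROP TABLE would also delete the data files, which is one reason external tables suit data that already lives in HDFS.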
Loading data into the Impala table from HDFS
Here is the SQL script that first creates the autos database, then creates the
automobiles table, and finally loads the whole dataset from HDFS. The script,
autos_script.sql, also creates an empty automakers table:
USE default;
DROP DATABASE IF EXISTS autos;
CREATE DATABASE autos;
USE autos;