Database Reference
In-Depth Information
[hadoop@hc1nn data]$ hdfs dfs -ls /data/uk_trade
Found 22 items
-rw-r--r-- 2 hadoop hadoop 355466 2014-09-16 18:09 /data/uk_trade/UKTI_FEBRUARY_2013.csv
-rw-r--r-- 2 hadoop hadoop 231177 2014-09-16 18:09 /data/uk_trade/ukti-admin-spend-apr-2011.
csv
.......
The necessary CSV-based data set now resides in HDFS under the directory /data/uk_trade, so I can start using
the data in Hive Query Language (Hive QL) statements. For example, the following CREATE TABLE statement creates
the rawtrans (raw transaction) table in the trade database. (Again, the IF NOT EXISTS clause ensures it is created only
if it does not exist.) The table is linked to the HDFS directory /data/uk_trade/, which contains the UK trade expense
information in CSV format via a LOCATION clause. As the data is in CSV format, the external table specifies that the
fields are delimited by commas, as specified by the ROW FORMAT DELIMITED FIELDS TERMINATED BY clause:
as a rule, reserved words appear in uppercase to make the examples clearer, although hive Ql is not
case-sensitive.
Note
CREATE TABLE IF NOT EXISTS
trade.rawtrans
(
dept STRING,
entity STRING,
paydate STRING,
exptype STRING,
exparea STRING,
supplier STRING,
trans STRING,
amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/uk_trade';
The columns in the table are separated by commas and are bound by parentheses. The column name is in
lowercase (for instance, dept ), followed by the data type in uppercase (such as STRING ). The name of the table to be
created is “rawtrans,” and it resides in the trade database. So the name trade.rawtrans refers to the rawtrans table in
the trade database.
The number of columns in the CSV files must match the columns in the SELECT statement. Also, it is good
practice to use meaningful names to represent each column—it avoids confusion later.
I can now access the data in the external table, as a simple COUNT(*) shows. This returns the result that there are
18,976 rows in this external table:
SELECT COUNT(*) FROM trade.rawtrans
 
 
Search WWH ::




Custom Search