LOAD DATA LOCAL INPATH 'C:/MsBigData/TestData/customers'
OVERWRITE INTO TABLE MsBigData.customer;
In the preceding statement, OVERWRITE indicates that any files in the table's
directory should be deleted before loading the new data. If it is left out, the
data files will be added to the files in the directory. The LOCAL keyword
indicates that the data will be copied from the local file system into the Hive
directory. The original copy of the files will be left in the local file system. If
the LOCAL keyword is not included, the path is resolved against the HDFS,
and the files are moved to the Hive directory, rather than being copied.
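For contrast, here is a minimal sketch of the same load with both keywords left out. The HDFS path is hypothetical and used only for illustration; the table name is the one from the preceding statement.

```sql
-- No LOCAL: the path is resolved against HDFS, and the files are moved
-- (not copied) into the table's directory.
-- No OVERWRITE: the files are added to whatever the directory already holds.
-- '/user/demo/staging/customers' is a hypothetical path, not one from the text.
LOAD DATA INPATH '/user/demo/staging/customers'
INTO TABLE MsBigData.customer;
```

Because the files are moved, the staging path should no longer contain them after the statement completes.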
What if you want to insert data into one table based on the contents of
another table? The INSERT statement handles that:
INSERT INTO TABLE customer
SELECT * FROM customer_import;
The INSERT statement supports any valid SELECT statement as a source
for the data. (The format for the SELECT statement is covered in the next
section.) The data from the SELECT statement is appended to the table. If
you replace the INTO keyword with OVERWRITE, the contents of the table
are replaced.
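As a short sketch of the OVERWRITE form, using the same two tables as the statement above:

```sql
-- Replaces the current contents of customer with the result of the SELECT,
-- instead of appending to it as INSERT INTO does.
INSERT OVERWRITE TABLE customer
SELECT * FROM customer_import;
```

Any valid SELECT could stand in for the SELECT * here, provided its columns line up with the columns of the target table.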
NOTE
Several variations of these statements can be used with partitioned
tables, as covered in the section “Loading Partitioned Tables,” later in
this chapter.
There is also the option to create managed tables in Hive based on selecting
data from another table:
CREATE TABLE florida_customers AS
SELECT * FROM MsBigData.Customers
WHERE state = 'FL';
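One way to confirm how the new table was created is to inspect its metadata. This is only a sketch, and it assumes the CREATE TABLE statement above ran in the current database:

```sql
-- Lists the columns copied from the SELECT, plus storage details.
-- The "Table Type" entry in the output should read MANAGED_TABLE.
DESCRIBE FORMATTED florida_customers;
```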