Loading data from HDFS using the LOAD DATA statement
As you know, Impala processes data that is stored in HDFS. When an Extract, Transform, Load (ETL) activity needs to load data from HDFS into Impala tables, you can use the LOAD DATA statement. The key properties of the LOAD DATA statement are as follows:
• The loaded data files are moved, not copied, from their HDFS location to the Impala data directory
• You can give either a file name or a directory name from HDFS; a directory name loads all of its files into the Impala table. However, a wildcard pattern is not supported in the HDFS path
The LOAD DATA statement syntax and an example are as follows:
LOAD DATA INPATH 'hdfs_file_or_directory_path'
[OVERWRITE]
INTO TABLE tablename
[PARTITION (partcol1=val1,
partcol2=val2 ...)]
Examples:
CREATE TABLE students (id int, name string);
LOAD DATA INPATH '/user/avkash/students.txt'
INTO TABLE students;
In the previous example, you have to make sure that the students.txt file is located in the /user/avkash folder in HDFS.
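The optional OVERWRITE and PARTITION clauses from the syntax above, together with loading from a directory instead of a single file, can be sketched as follows. This is only an illustration: the students_by_year table, its year partition column, and the /user/avkash/students_2013 directory are hypothetical names that do not appear in the original example. The sketch also assumes that the files in that directory already match the table's file format.

```sql
-- Hypothetical partitioned table; the table name, column names, and
-- HDFS path below are illustrative assumptions, not from the original text.
CREATE TABLE students_by_year (id INT, name STRING)
PARTITIONED BY (year INT);

-- Create the target partition before loading data into it.
ALTER TABLE students_by_year ADD PARTITION (year=2013);

-- Load every file found in the directory (no wildcard pattern in the path)
-- into the year=2013 partition, replacing any data already in that partition.
LOAD DATA INPATH '/user/avkash/students_2013'
OVERWRITE INTO TABLE students_by_year
PARTITION (year=2013);
```

Because the files are moved rather than copied, they should no longer be present under /user/avkash/students_2013 once the statement completes. Without OVERWRITE, the loaded files would be added to whatever data the partition already holds.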