Tip
For details on SequenceFile file format support, see the Cloudera Impala documentation at the following URL:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_seqfile.html
The Parquet file format with Impala tables
You might be wondering what the Parquet file format is. Parquet is a
column-oriented binary file format designed to provide efficient access to
individual columns. Because the values of each column are stored together,
separately from the other columns, a query reads only the columns it references
rather than entire rows. This column-oriented access makes query processing fast
and efficient, especially for queries that touch only a few columns of a wide
table, and Impala takes advantage of this file format. Impala provides native
support for creating, managing, and querying tables based on the Parquet file format.
The following is the syntax for creating a table stored in the Parquet file format
in Impala:
CREATE TABLE my_parquet_table (
  userID int,
  userName string)
STORED AS PARQUETFILE;
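As a side note, later Impala releases also accept the shorter keyword PARQUET in the STORED AS clause, while PARQUETFILE remains available as the older spelling. The following is a minimal sketch of the same table definition using that keyword. The table name my_parquet_table_v2 is hypothetical and is used only to avoid clashing with the table above. Check the documentation for your Impala version before relying on either spelling.

```sql
-- Same columns as my_parquet_table; only the STORED AS keyword differs.
-- my_parquet_table_v2 is a hypothetical name chosen for this illustration.
CREATE TABLE my_parquet_table_v2 (
  userID int,
  userName string)
STORED AS PARQUET;
```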
Because Impala can write Parquet files itself, you can use the INSERT statement,
as shown in the following code snippet, to populate your Parquet table with data
from another table. Note that INSERT OVERWRITE replaces any data already in the table:
INSERT OVERWRITE TABLE my_parquet_table
SELECT * FROM other_table_name;
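Once the data is loaded, you may want to confirm that the table really is stored as Parquet and that the rows arrived. The following is a small sketch of how that might be checked. It assumes the my_parquet_table definition shown earlier, and the filter value 'a%' is an arbitrary placeholder, not something from the original example.

```sql
-- Show detailed table metadata; the input/output format lines in the
-- output should mention Parquet if the table was created as shown earlier.
DESCRIBE FORMATTED my_parquet_table;

-- Check how many rows the INSERT wrote.
SELECT COUNT(*) FROM my_parquet_table;

-- A query that references a single column. With a columnar format,
-- only the userName column data needs to be read to answer it.
-- The pattern 'a%' is a placeholder for illustration.
SELECT userName
FROM my_parquet_table
WHERE userName LIKE 'a%';
```

The last query is the kind of workload where a column-oriented format tends to help most: it names one column, so the userID data does not have to be scanned.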