File formats and compression types supported in Impala
Hadoop is used as a data storage system in which all kinds of data are stored in various
file formats. To reduce disk space requirements, data is often stored in compressed
form, and different compression types are used with different file formats. Together,
these produce a set of file format and compression combinations that an application
may need to support.
Impala supports most of the popular file formats and compression types, as listed in
the following table:
File type     Compression type              CREATE support  INSERT support
Text          LZO                           Yes             Yes
Avro          GZIP, BZIP2, deflate, Snappy  No (use Hive)   No
RCFile        GZIP, BZIP2, deflate, Snappy  Yes             No
SequenceFile  GZIP, BZIP2, deflate, Snappy  Yes             No
Parquet       GZIP, Snappy (default)        Yes             Yes
The preceding table also shows whether Impala can use the CREATE or INSERT command
with specific file and compression types. For example, with the Parquet file type
and the Snappy or GZIP compression type, Impala can create tables as well as insert
data into them.
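
As a rough illustration, the support table can be encoded as a small lookup that answers whether a given file type and compression combination is listed as supporting CREATE or INSERT. This is only a sketch: the names Support, IMPALA_SUPPORT, and can_use are hypothetical helpers introduced here, not part of Impala, and the data is transcribed from the table above rather than queried from a live cluster.

```python
from typing import NamedTuple, Tuple


class Support(NamedTuple):
    """One row of the support table (hypothetical helper type)."""
    compression: Tuple[str, ...]
    create: bool
    insert: bool


_COMMON = ("GZIP", "BZIP2", "deflate", "Snappy")

# Transcribed from the table in this section; not read from Impala itself.
IMPALA_SUPPORT = {
    "Text": Support(("LZO",), create=True, insert=True),
    "Avro": Support(_COMMON, create=False, insert=False),  # CREATE: use Hive
    "RCFile": Support(_COMMON, create=True, insert=False),
    "SequenceFile": Support(_COMMON, create=True, insert=False),
    "Parquet": Support(("GZIP", "Snappy"), create=True, insert=True),
}


def can_use(file_type: str, compression: str, command: str) -> bool:
    """Return True if the table lists `command` as supported for the combination.

    `command` is "CREATE" or "INSERT". Unknown file types, or compression
    types not listed for the file type, yield False.
    """
    entry = IMPALA_SUPPORT.get(file_type)
    if entry is None:
        return False
    listed = {codec.lower() for codec in entry.compression}
    if compression.lower() not in listed:
        return False
    command = command.upper()
    if command == "CREATE":
        return entry.create
    if command == "INSERT":
        return entry.insert
    raise ValueError(f"unknown command: {command!r}")


print(can_use("Parquet", "Snappy", "INSERT"))  # True
print(can_use("Avro", "GZIP", "CREATE"))       # False
```

A lookup like this only mirrors the table as printed; support changes between Impala releases, so the release documentation remains the authority.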