Designing, Building, and Loading Tables
If you are familiar with basic T-SQL data definition language (DDL)
commands, you already have a good head start in working with Hive tables.
To declare a Hive table, you issue a CREATE statement similar to those used
to create tables in a SQL Server database. The following example creates a
simple table using primitive types that are common to other database systems:
CREATE EXTERNAL TABLE iislog (
  -- date is a reserved keyword in newer Hive releases; backticks keep
  -- both names usable as column identifiers
  `date` STRING,
  `time` STRING,
  username STRING,
  ip STRING,
  port INT,
  method STRING,
  uristem STRING,
  uriquery STRING,
  timetaken INT,
  useragent STRING,
  referrer STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Two important distinctions need to be pointed out with regard to the
preceding example. First, note the EXTERNAL keyword. This keyword tells
Hive that for this table it owns only the table metadata and not the
underlying data. Omitting the keyword (the default) creates a managed
table, often called an internal table, which gives Hive control of both the
metadata and the underlying data. Hive has no INTERNAL keyword; a table is
managed simply because EXTERNAL is absent.
The difference between these two options is most evident when the table is
dropped using the DROP TABLE command. Because Hive does not own the
data for an EXTERNAL table, only the metadata is removed, and the data
files remain in place. For a managed (internal) table, both the table
metadata and the data are deleted.
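The difference can be seen with a short sketch. The table names iislog_ext and iislog_managed and the directory /data/iislog are hypothetical, chosen only for illustration, and the comments describe the usual behavior rather than the outcome on any particular cluster:

```sql
-- Hypothetical external table over files that already exist in HDFS.
-- The LOCATION path is an assumption for this sketch.
CREATE EXTERNAL TABLE iislog_ext (
  uristem   STRING,
  timetaken INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/iislog';

-- Hypothetical managed table: no EXTERNAL keyword, so Hive owns the data
-- and stores it under its own warehouse directory.
CREATE TABLE iislog_managed (
  uristem   STRING,
  timetaken INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- The "Table Type" row of the output reports EXTERNAL_TABLE or MANAGED_TABLE.
DESCRIBE FORMATTED iislog_ext;
DESCRIBE FORMATTED iislog_managed;

-- Removes only the metadata; the files under /data/iislog are left alone.
DROP TABLE iislog_ext;

-- Removes the metadata and the table's data files as well.
DROP TABLE iislog_managed;
```

Running DESCRIBE FORMATTED before a DROP TABLE is a cheap way to confirm which kind of table you are about to remove.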
The second distinction in the CREATE statement is found on the final line of
the command: ROW FORMAT DELIMITED FIELDS TERMINATED BY ','.
This clause instructs Hive to read the underlying data file and split the
columns or fields using a comma delimiter. This is indicative that instead