STORED AS
INPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
The row format options are controlled by the ROW FORMAT portion. The
delimited SerDe is the default. To specify a custom SerDe, use the SERDE
keyword followed by the class name of the SerDe. For example, the
RegexSerDe can be specified as follows:
ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
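To show where this clause sits in a complete statement, here is a minimal sketch. The table name, column names, and regular expression are hypothetical and chosen only for illustration; the sketch assumes each input line holds three space-separated fields and that the Hive contrib JAR containing this SerDe is available to the session. The contrib RegexSerDe expects every column to be a STRING, with one capture group per column.

```sql
-- Hypothetical table: each input line is "host request status".
CREATE TABLE access_log (
  host STRING,
  request STRING,
  status STRING
)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- One capture group per column, in column order.
  'input.regex' = '([^ ]*) ([^ ]*) ([^ ]*)'
)
STORED AS TEXTFILE;
```

Lines that do not match the expression generally come back with NULL in every column, so it is worth testing the pattern against a sample of the data first.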
Another important option in table creation is the EXTERNAL option. By
default, when you create a table without specifying EXTERNAL, it is created
as a managed table. This means that Hive considers itself the manager of the
table, including any data created in it. The data for the table will be stored
in a subdirectory under the database folder, and if the table is dropped, Hive
will remove all the data associated with the table.
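The managed-table behavior can be sketched as follows. The table name customer_managed is hypothetical; the MsBigData database is assumed to exist already.

```sql
-- Hypothetical managed table: no EXTERNAL keyword, so Hive owns the data.
CREATE TABLE MsBigData.customer_managed (
  name STRING,
  city STRING
);

-- The "Table Type" field in the output should read MANAGED_TABLE.
DESCRIBE FORMATTED MsBigData.customer_managed;

-- Dropping a managed table removes the metadata and the data files.
DROP TABLE MsBigData.customer_managed;
```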
However, if you use CREATE EXTERNAL TABLE to create the table, Hive
creates the metadata for the table, and allows you to query it, but it doesn't
consider itself the owner of the table. If the table is dropped, the metadata
for it will be deleted, but the data will be left intact. External tables are
particularly useful for data files that are shared among multiple
applications. Creating the Hive table definition lets you query the data with
the power of Hive, while making it clear that the data is shared with other
applications.
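As a rough illustration of that ownership difference, the sketch below drops an external table and then re-creates its definition over the same files. The table name shared_events and the path /data/shared/events are hypothetical; the path is assumed to already contain data files in Hive's default delimited text format.

```sql
-- Hypothetical external table over files owned by another application.
CREATE EXTERNAL TABLE shared_events (
  event_time STRING,
  message STRING
)
LOCATION '/data/shared/events';

-- Removes only the table definition; the files under the path stay in place.
DROP TABLE shared_events;

-- Re-creating the definition over the same path makes the data queryable again.
CREATE EXTERNAL TABLE shared_events (
  event_time STRING,
  message STRING
)
LOCATION '/data/shared/events';
```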
When you use the EXTERNAL keyword, you typically also use the LOCATION
option to tell Hive where the existing data resides:
CREATE EXTERNAL TABLE MsBigData.customer (
name STRING,
city STRING,
state STRING,
postalCode STRING,
purchases MAP<STRING, DECIMAL>