Hive - Hadoop: The Definitive Guide


The Hive Shell

The shell is the primary way that we will interact with Hive, by issuing commands in HiveQL. HiveQL is Hive's query language, a dialect of SQL. It is heavily influenced by MySQL, so if you are familiar with MySQL, you should feel at home using Hive.

When starting Hive for the first time, we can check that it is working by listing its tables. There should be none. The command must be terminated with a semicolon to tell Hive to execute it:

hive> SHOW TABLES;
OK
Time taken: 0.473 seconds

Like SQL, HiveQL is generally case insensitive (except for string comparisons), so show tables; works equally well here. The Tab key will autocomplete Hive keywords and functions.

For a fresh install, the command takes a few seconds to run as it lazily creates the metastore database on your machine. (The database stores its files in a directory called metastore_db, which is relative to the location from which you ran the hive command.)

You can also run the Hive shell in noninteractive mode. The -f option runs the commands in the specified file, which is script.q in this example:

% hive -f script.q
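The contents of script.q are not shown here. As a minimal sketch, assuming the file simply holds a couple of semicolon-terminated HiveQL statements, it could be created and run as follows. The statements are illustrative assumptions, and the guard around the hive call only lets the snippet run on a machine without Hive installed:

```shell
# Write a small HiveQL script. The two statements are illustrative only;
# they are not the contents of the original script.q.
cat > script.q <<'EOF'
SHOW TABLES;
SHOW FUNCTIONS;
EOF

# Run the script noninteractively if the hive command is available.
if command -v hive >/dev/null 2>&1; then
  hive -f script.q
else
  echo "hive not found on PATH; skipping: hive -f script.q"
fi
```

Unlike the -e form, each statement in a script file still needs its terminating semicolon.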

For short scripts, you can use the -e option to specify the commands inline, in which case the final semicolon is not required:

% hive -e 'SELECT * FROM dummy'
OK
X
Time taken: 1.22 seconds, Fetched: 1 row(s)

NOTE

It's useful to have a small table of data to test queries against, such as trying out functions in SELECT expressions using literal data (see Operators and Functions). Here's one way of populating a single-row table:

% echo 'X' > /tmp/dummy.txt
% hive -e "CREATE TABLE dummy (value STRING); \
LOAD DATA LOCAL INPATH '/tmp/dummy.txt' \
OVERWRITE INTO TABLE dummy"
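Once the dummy table exists, it can serve as a one-row source for trying out functions on literal values. The following is a sketch under that assumption. The QUERY variable name and the particular literals are chosen here for illustration, and upper and length stand in for whichever built-in functions you want to test:

```shell
# Evaluate two built-in string functions on literals, selecting from the
# single-row dummy table so that exactly one result row comes back.
QUERY="SELECT upper('hello'), length('hive') FROM dummy"

# Run the query inline if the hive command is available.
if command -v hive >/dev/null 2>&1; then
  hive -e "$QUERY"
else
  echo "hive not found on PATH; would run: hive -e \"$QUERY\""
fi
```

Because the table has one row, each expression is evaluated once, which keeps the output easy to read.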
