Execution engines
Hive was originally written to use MapReduce as its execution engine, and that is still the
default. It is now also possible to run Hive using Apache Tez as its execution engine, and
work is underway to support Spark (see Chapter 19), too. Both Tez and Spark are general
directed acyclic graph (DAG) engines that offer more flexibility and higher performance
than MapReduce. For example, unlike MapReduce, where intermediate job output is
materialized to HDFS, Tez and Spark can avoid replication overhead by writing the
intermediate output to local disk, or even storing it in memory (at the request of the Hive
planner).
The execution engine is controlled by the hive.execution.engine property, which
defaults to mr (for MapReduce). It's easy to switch the execution engine on a per-query
basis, so you can see the effect of a different engine on a particular query. Set Hive to use
Tez as follows:
hive> SET hive.execution.engine=tez;
Note that Tez needs to be installed on the Hadoop cluster first; see the Hive
documentation for up-to-date details on how to do this.
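Because the engine is just a configuration property, the same per-query switch can be made from the operating system shell. The following is a minimal sketch, not an official recipe: run_with_engine is a hypothetical helper name, and it assumes the hive command is on the PATH and accepts the -hiveconf and -e options.

```shell
# Hypothetical helper: run one HiveQL statement under a chosen execution engine.
#   $1 = engine name (for example mr or tez)
#   $2 = HiveQL text to execute
run_with_engine() {
  hive -hiveconf hive.execution.engine="$1" -e "$2"
}

# Example usage (the table name "records" is an assumption for illustration):
#   run_with_engine mr  'SELECT COUNT(*) FROM records'
#   run_with_engine tez 'SELECT COUNT(*) FROM records'
```

Running the same statement once per engine makes it straightforward to compare their behavior on a particular query without changing the session default.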
Logging
You can find Hive's error log on the local filesystem at
${java.io.tmpdir}/${user.name}/hive.log. It can be very useful when trying to diagnose
configuration problems or other types of error. Hadoop's MapReduce task logs are also a
useful resource for troubleshooting; see Hadoop Logs for where to find them.
On many systems, ${java.io.tmpdir} is /tmp, but if it's not, or if you want to set
the logging directory to another location, then use the following:
% hive -hiveconf hive.log.dir='/tmp/${user.name}'
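To make the default location concrete, the path can be assembled in the shell. This is only a sketch under stated assumptions: hive_log_path is a hypothetical helper, /tmp stands in for ${java.io.tmpdir}, and the output of id -un stands in for ${user.name}. Neither assumption is guaranteed to match what the JVM reports on a given system.

```shell
# Hypothetical helper: print the default location of hive.log.
#   $1 = value of java.io.tmpdir (assumed to be /tmp when omitted or empty)
#   $2 = user name (assumed to be the current login name when omitted or empty)
hive_log_path() {
  printf '%s/%s/hive.log\n' "${1:-/tmp}" "${2:-$(id -un)}"
}

# Example usage: show the last lines of the log for the current user.
#   tail -n 50 "$(hive_log_path)"
```

If hive.log.dir has been overridden as shown above, pass that directory's parent and user name explicitly rather than relying on the defaults.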
The logging configuration is in conf/hive-log4j.properties, and you can edit this file to
change log levels and other logging-related settings. However, often it's more convenient
to set logging configuration for the session. For example, the following handy invocation
will send debug messages to the console:
% hive -hiveconf hive.root.logger=DEBUG,console
Hive Services
The Hive shell is only one of several services that you can run using the hive command.
You can specify the service to run using the --service option. Type hive --ser-