Hive - Hadoop: The Definitive Guide

Hive has not stood still, though. Since Impala was launched, the “Stinger” initiative by Hortonworks has improved the performance of Hive through support for Tez as an execution engine and the addition of a vectorized query engine, among other improvements.
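As a minimal sketch of how these two improvements are switched on for a session, the settings below select Tez as the execution engine and enable vectorized execution. It assumes Tez is installed on the cluster and a Hive release recent enough to support both features. The records table and its columns are hypothetical and appear only for illustration.

```sql
-- Run this session's queries on Tez instead of MapReduce
-- (assumption: Tez is installed and configured on the cluster).
SET hive.execution.engine=tez;

-- Enable the vectorized query engine, which processes rows in batches
-- rather than one at a time.
SET hive.vectorized.execution.enabled=true;

-- Hypothetical table and columns, for illustration only.
SELECT year, MAX(temperature)
FROM records
GROUP BY year;
```

Whether a particular query actually benefits from vectorization depends on the Hive version and the table's storage format, so treat these settings as a starting point rather than a guarantee.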

Other prominent open source Hive alternatives include Presto from Facebook, Apache Drill, and Spark SQL. Presto and Drill have similar architectures to Impala, although Drill targets SQL:2011 rather than HiveQL. Spark SQL uses Spark as its underlying engine, and lets you embed SQL queries in Spark programs.

NOTE

Spark SQL is different from using the Spark execution engine from within Hive (“Hive on Spark,” see Execution engines). Hive on Spark provides all the features of Hive, since it is part of the Hive project. Spark SQL, on the other hand, is a new SQL engine that offers some level of Hive compatibility.

Apache Phoenix takes a different approach entirely: it provides SQL on HBase. SQL access is through a JDBC driver that turns queries into HBase scans and takes advantage of HBase coprocessors to perform server-side aggregation. Metadata is stored in HBase, too.
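To make this concrete, here is a small sketch of the kind of SQL a client might send through the Phoenix JDBC driver. The table name, column names, and values are hypothetical. The statements follow Phoenix's SQL dialect as commonly documented (note UPSERT rather than INSERT), but check them against the Phoenix release you are using.

```sql
-- Hypothetical table; Phoenix maps the primary key to the HBase row key.
CREATE TABLE IF NOT EXISTS page_visits (
  id BIGINT NOT NULL PRIMARY KEY,
  host VARCHAR,
  visitors INTEGER
);

-- Phoenix uses UPSERT for both inserts and updates.
UPSERT INTO page_visits VALUES (1, 'eu-1', 35);
UPSERT INTO page_visits VALUES (2, 'eu-1', 12);
UPSERT INTO page_visits VALUES (3, 'na-1', 40);

-- An aggregate query of the kind described above: the driver turns it into
-- HBase scans, and the aggregation can be performed server-side.
SELECT host, SUM(visitors)
FROM page_visits
GROUP BY host;
```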
