Impala High Availability
Impala runs on DataNodes and takes advantage of any High Availability (HA)
configuration available to them. Impala uses data stored in HDFS, the distributed
data storage layer in Hadoop, whose work is divided between the NameNode and the
DataNodes. Hadoop provides a NameNode High Availability configuration; if you would
like to learn more about it, I recommend the Hadoop documentation.
To make Impala highly available, the best option is to take advantage of the HDFS
HA feature. As an Impala cluster administrator, you can upgrade the Hive metastore to
use HDFS HA features. Because Impala depends on the Hive metastore, if the primary
metastore becomes unavailable, it remains available on the other HDFS HA node
without significant downtime.
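
The primary/standby idea can be illustrated with a small sketch. This is not how Impala or HDFS HA performs failover internally; it only shows the general pattern of trying a primary endpoint first and falling back to the next one when it is unreachable. The host names are hypothetical, port 9083 is the usual Hive metastore Thrift port, and the helper names `is_reachable` and `pick_endpoint` are invented for this example.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Endpoint = Tuple[str, int]


def is_reachable(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    host, port = endpoint
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def pick_endpoint(
    endpoints: Iterable[Endpoint],
    probe: Callable[[Endpoint], bool] = is_reachable,
) -> Optional[Endpoint]:
    """Return the first endpoint that passes the probe, in list order.

    The first entry plays the role of the primary; later entries are
    fallbacks. Returns None when no endpoint responds.
    """
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    # Hypothetical metastore hosts; replace with real ones to try it.
    candidates = [
        ("metastore-1.example.com", 9083),
        ("metastore-2.example.com", 9083),
    ]
    print(pick_endpoint(candidates))
```

The probe is passed in as a parameter so the selection logic can be exercised without a live cluster, for example by substituting a function that reports the first host as down.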