available, you might receive the following message in the Impala logs, indicating that
native checksumming is not enabled:
"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"
Enabling Impala to perform short-circuit read on DataNode
Short-circuit read means that data is read directly from the local filesystem instead of
going through the DataNode first, which improves performance by avoiding that extra hop.
You must have Cloudera CDH 4.2 or higher for fast, compatible short-circuit reads.
The following steps assume that you have Cloudera CDH 4.2 or higher installed:
1. Modify hdfs-site.xml on each Impala node as follows:
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout</name>
<value>3000</value>
</property>
2. If /var/run/hadoop-hdfs/ is group-writable, make sure that its group is root.
3. Copy hdfs-site.xml and core-site.xml from the Hadoop configuration to the Impala configuration directory, /etc/impala/conf, on each Impala node.
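Once the files are copied, it can help to confirm that each node's hdfs-site.xml actually carries the three settings from step 1. The following is a minimal sketch of such a check, not part of Impala or CDH: the names EXPECTED, read_properties, find_problems, and check_file are hypothetical, and the script only compares the property values listed above as plain strings.

```python
import xml.etree.ElementTree as ET

# Settings from step 1 (assumed here to be required verbatim).
EXPECTED = {
    "dfs.client.read.shortcircuit": "true",
    "dfs.domain.socket.path": "/var/run/hadoop-hdfs/dn._PORT",
    "dfs.client.file-block-storage-locations.timeout": "3000",
}


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Strip whitespace so values wrapped across lines still compare cleanly.
            props[name.strip()] = (value or "").strip()
    return props


def find_problems(xml_text, expected=EXPECTED):
    """List mismatches between the configuration text and the expected settings."""
    actual = read_properties(xml_text)
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name}: missing")
        elif got != want:
            problems.append(f"{name}: expected {want!r}, found {got!r}")
    return problems


def check_file(path="/etc/impala/conf/hdfs-site.xml"):
    """Hypothetical helper: check the copy of hdfs-site.xml on one Impala node."""
    with open(path, "rb") as f:
        return find_problems(f.read())
```

Calling check_file() on an Impala node returns an empty list when all three properties match, and otherwise one line per missing or differing property. The check covers the file contents only; it does not inspect the permissions of /var/run/hadoop-hdfs/.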