Next, I install a repository file under /etc/yum.repos.d on hc1r1m1 for Impala, so that the Linux yum command
knows where to find the Cloudera Impala software. The repository file is downloaded from Cloudera's site by using
the Linux wget command:
[root@hc1r1m1 ~]# cd /etc/yum.repos.d
[root@hc1r1m1 yum.repos.d]# wget http://archive.cloudera.com/impala/redhat/6/x86_64/impala/cloudera-impala.repo
I can examine the contents of this downloaded repository file by using the Linux cat command:
[root@hc1r1m1 yum.repos.d]# cat cloudera-impala.repo
[cloudera-impala]
name=Impala
baseurl= http://archive.cloudera.com/impala/redhat/6/x86_64/impala/1/
gpgkey = http://archive.cloudera.com/impala/redhat/6/x86_64/impala/RPM-GPG-KEY-cloudera
gpgcheck = 1
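Before running yum, it can be worth confirming that the repository file holds the values I expect. The following is only a minimal sketch and not part of the original procedure: repo_value is a hypothetical helper name of my own, and it assumes that each key starts a line and is followed by an equals sign, with or without surrounding spaces, as in the file shown above.

```shell
# repo_value is a hypothetical helper, not part of yum. It prints the value
# of one key from a .repo file, tolerating spaces around the "=" sign.
#   $1 = path to the .repo file
#   $2 = key name, for example baseurl or gpgcheck
repo_value() {
    # Strip the "key =" prefix and print only lines where that prefix matched;
    # head keeps the first match in case the key appears in several sections.
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}
```

For example, running repo_value /etc/yum.repos.d/cloudera-impala.repo gpgcheck against the file listed above should print 1, which tells me that yum will verify package signatures against the key named in gpgkey.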
Next, I install the Impala components and the Impala shell by using the yum command as the Linux root user:
[root@hc1r1m1 ~]# yum install impala impala-server impala-state-store impala-catalog impala-shell
This command installs the Impala Catalog server, the Impala server, the Impala State Store server, and the
Impala scripting shell. The Impala server runs on each node in an Impala cluster; it accepts queries and reads and
writes the underlying data files. The Impala scripting shell acts as a client: it receives user commands and passes
them to the server. Key to making an Impala cluster robust, the State Store server monitors the state of the cluster
and manages the workload when something goes wrong. The Catalog server manages metadata (that is, data about
data) and passes details about metadata changes to the rest of the cluster.
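To confirm that yum really did install all five packages, I can ask rpm about each one in turn. This is only a sketch under stated assumptions: missing_packages is a hypothetical helper name, and it relies on rpm -q returning a non-zero exit status for a package that is not installed.

```shell
# missing_packages is a hypothetical helper. It prints the name of every
# package in its argument list that rpm does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the named package is not installed
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

Calling missing_packages impala impala-server impala-state-store impala-catalog impala-shell should print nothing if the installation succeeded; any name it does print is a package that still needs to be installed.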
As soon as the software is installed, it is time to configure it. I copy the Hive hive-site.xml, the HBase hbase-site.xml,
and the Hadoop files core-site.xml and hdfs-site.xml to the Impala configuration area, which I find under /etc/impala/conf.
The dot character ( . ) at the end of each cp (copy) command is just Linux shorthand for the current directory:
[root@hc1r1m1 conf]# cd /etc/impala/conf
[root@hc1r1m1 conf]# cp /etc/hive/conf/hive-site.xml .
[root@hc1r1m1 conf]# cp /etc/hadoop/conf/core-site.xml .
[root@hc1r1m1 conf]# cp /etc/hbase/conf/hbase-site.xml .
[root@hc1r1m1 conf]# cp /etc/hadoop/conf/hdfs-site.xml .
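The four copies above can also be wrapped in a small loop that reports any source file that is missing, which is useful if Hive or HBase is installed somewhere other than the default location. This is a minimal sketch only; copy_site_files is a hypothetical helper name, and the paths passed to it are simply the ones used above.

```shell
# copy_site_files is a hypothetical helper. It copies each named file into
# the destination directory and reports any source file that does not exist.
#   $1     = destination directory
#   $2 ... = source files to copy
copy_site_files() {
    dest=$1
    shift
    failed=0
    for src in "$@"; do
        if [ -f "$src" ]; then
            cp "$src" "$dest"/ || failed=1
        else
            echo "missing: $src" >&2
            failed=1
        fi
    done
    # Non-zero return status if any file was missing or could not be copied
    return "$failed"
}
```

As the root user, I could then run copy_site_files /etc/impala/conf /etc/hive/conf/hive-site.xml /etc/hadoop/conf/core-site.xml /etc/hbase/conf/hbase-site.xml /etc/hadoop/conf/hdfs-site.xml and check the return status before moving on to edit the copied files.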
To specify the host and port number for the Hive metastore Thrift API, as well as a timeout value for
access, I make the following changes to the copy of hive-site.xml in the Impala configuration area:
<!-- impala changes -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://hc1r1m1:9083</value>
<description>
IP address (or fully-qualified domain name) and port of the metastore host
</description>
</property>
 