the password for that account, plus I know that the Hive version that I am using for Talend is version 2. That means
that the only property I still need to determine is the port number on which HiveServer2 listens. Given that all logs
are stored under /var/log on Cloudera CDH5 servers, I obtain that information as follows:
[hadoop@hc2nn hive]$ pwd
/var/log/hive
[hadoop@hc2nn hive]$ ls -l
total 3828
drwx------ 2 hive hive 4096 Aug 31 12:14 audit
-rw-r--r-- 1 hive hive 2116446 Nov 8 09:58 hadoop-cmf-hive-HIVEMETASTORE-hc2nn.semtech-solutions.co.nz.log.out
-rw-r--r-- 1 hive hive 1788700 Nov 8 09:58 hadoop-cmf-hive-HIVESERVER2-hc2nn.semtech-solutions.co.nz.log.out
[hadoop@hc2nn hive]$ grep ThriftCLIService hadoop-cmf-hive-HIVESERVER2-*.log.out | grep listen | tail -2
2014-11-08 09:49:47,269 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10000
2014-11-08 09:58:58,608 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10000
The first command, the Linux pwd (print working directory) command, shows that I am in the directory
/var/log/hive. (Note: use the cd command to move to that directory, if necessary.) Next, using the Linux ls command
with the -l option to provide a long listing, I check which log files exist in this Hive log directory. I then use
the Linux grep command to search the HIVESERVER2-based log file for the string ThriftCLIService. I pipe ( | ) the
output of this search to another grep command, which searches the output further for lines that also contain the text
“listen.” Finally, I limit the output to the last two lines via the Linux tail command with a parameter of -2. The port
number that I need appears at the end of each output line. So 10000, the default HiveServer2 port, is the port number
that will be used in the Talend Hive connection for this section.
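The log search described above can also be reduced to the bare port number. The following is a minimal sketch, not part of the original procedure: the function name extract_port is hypothetical, and the sample log lines are copied from the output shown earlier so that the script runs without access to the server. On a real host, the log file itself would be piped into the function instead.

```shell
#!/bin/sh
# Sample lines copied from the HIVESERVER2 log output shown earlier.
# On a real server, the input would come from the log file instead.
log_lines='2014-11-08 09:49:47,269 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10000
2014-11-08 09:58:58,608 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10000'

# extract_port (hypothetical helper): read log lines on stdin, keep the
# most recent "listen" line, and print the digits after the final colon.
extract_port() {
    grep ThriftCLIService | grep listen | tail -n 1 |
        sed 's/.*:\([0-9][0-9]*\)$/\1/'
}

port=$(printf '%s\n' "$log_lines" | extract_port)
echo "$port"
```

With the sample lines above, the script prints 10000, which matches the value read manually from the end of each log line.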
So, now I am ready to create a Hive database connection. I can do this by right-clicking the DB Connections
option in the Repository pane. Then, I select Create DB Connection to open a form that offers a two-step process for
creating the connection.
The first step requests the name, purpose, description, and status of the connection. Take care to make the
name meaningful. The second step (shown in Figure 11-19) gives the actual connection details. That is, the database
type is set to Hive, and the server and port are defined as hc2nn and 10000, as previously determined. The Linux account
login for the CentOS host hc2nn is set to hadoop, along with its password. The Hive version is set to Hive2, while
the Hadoop version and instance are set to match the Hadoop cluster being used, Cloudera/CDH5. Finally, the
JDBC string, which defines the Java-based method that Talend will use to connect to Hive, is set to a connection
string that uses the hostname, port, and Hive version.
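As an illustration of how such a connection string fits together, here is a minimal sketch that assembles a HiveServer2 JDBC URL from the values above. The host and port come from the text. The database name default is an assumption, because the text does not state which database the connection targets, and the exact string in the Talend form may differ.

```shell
#!/bin/sh
# Values taken from the text: host hc2nn, port 10000, Hive version 2.
hive_host="hc2nn"
hive_port="10000"
# Assumption: the database name "default" is not stated in the text.
hive_db="default"

# HiveServer2 JDBC URLs use the jdbc:hive2:// scheme.
jdbc_url="jdbc:hive2://${hive_host}:${hive_port}/${hive_db}"
echo "$jdbc_url"

# Optional check outside Talend, if the beeline client is installed
# (the password placeholder is left for the reader to fill in):
#   beeline -u "$jdbc_url" -n hadoop -p '<password>'
```

Testing the same URL with a command-line client such as beeline before entering it into Talend can help separate Hive connectivity problems from Talend configuration problems.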
 