Adding Structure with Hive - Microsoft Big Data Solutions

NOTE

In a default Hive setup, the Derby database used for the metastore may

be configured for single-user access. If you are just testing Hive or

running a local instance for development, this may be fine. For a Hive

implementation in a production environment, however, you will want to

upgrade the metastore to a multiuser setup backed by a more robust

database. MySQL is one of the more common choices, but the metastore

can be any Java Database Connectivity (JDBC)-compliant database. If

you are using the Hortonworks HDP 1.3 distribution for Windows, SQL

Server is also supported as a metastore.

Hive v0.11 also includes HiveServer2. This version of the Hive server

improves support for multiuser concurrency and supports additional

authentication methods, while providing the same experience as the

standard Hive server. Again, for a production environment,

HiveServer2 may be a better fit. The examples used in this chapter run

against both the standard Hive server and HiveServer2.

Another area of difference between Hive and many relational databases is

its support for different data types. Due to the unstructured data that it must

support, it defines a number of data types that you won't find in a traditional

relational database.

Hive Data Types

Table 6.1 lists the data types supported by Hive. Many of these data types

have equivalents in SQL Server, but a few are unique to Hive. Even for

the data types that appear familiar, it is important to remember that Hive

is written in Java, and so these data types are implemented in

Java. Their behavior will match that of a Java application that

uses the same data type. One immediate difference you will notice is that

STRING types do not have a defined length. This is normal for Java and

other programming languages, but is not typical for relational databases.
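
To make the point about Java-backed types concrete, here is a minimal Java sketch. It only demonstrates how the underlying Java types behave. It does not call Hive, and the class and method names (HiveTypeDemo, addOne, repeat) are hypothetical, chosen for illustration. It shows that a Java String is declared without a maximum length, and that a Java int is a 32-bit signed value that wraps around on overflow. Whether a particular Hive version surfaces the same overflow behavior for its INT type is an assumption you should verify against your own installation.

```java
public class HiveTypeDemo {

    // Java's int is a 32-bit signed value; adding 1 to the
    // largest possible int silently wraps to the smallest one.
    static int addOne(int value) {
        return value + 1;
    }

    // A Java String has no declared maximum length, unlike a
    // relational VARCHAR(n) column. It simply grows as needed.
    static String repeat(String text, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints -2147483648 (Integer.MIN_VALUE)
        System.out.println(addOne(Integer.MAX_VALUE));

        // Prints 20000: no length was ever declared for the string
        System.out.println(repeat("hive", 5000).length());
    }
}
```

In SQL Server you would normally size a character column up front, for example as VARCHAR(50). With a Java-backed STRING there is no such declaration to make, which is why the type appears without a length in Hive table definitions.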
