The details of querying this denormalized data are discussed later in this
chapter. For now, we will review the data types that support these
structures.
A STRUCT is a column that contains multiple defined fields. Each field can
have its own data type. This is comparable to structs in most programming
languages. In Hive, you can declare a STRUCT for a full name using the
following syntax:
STRUCT<FirstName:string, MiddleName:string, LastName:string>
To access the individual fields of the STRUCT type, use the column name
followed by a period and the name of the field:
FullName.FirstName
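As a minimal sketch of how the declaration and the dot notation fit together, the following HiveQL defines a table with a STRUCT column and reads two of its fields. The table name customers and the id column are hypothetical, chosen only for illustration; only the FullName column comes from the text above.

```sql
-- Hypothetical table; only the FullName definition is taken from the text.
CREATE TABLE customers (
  id INT,
  FullName STRUCT<FirstName:STRING, MiddleName:STRING, LastName:STRING>
);

-- Dot notation reads individual fields of the STRUCT column.
SELECT FullName.FirstName, FullName.LastName
FROM customers;
```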
An ARRAY is a column that contains an ordered sequence of values. All the
values must be of the same type:
ARRAY<STRING>
Because it is ordered, the individual values can be accessed by their index.
As with Java and .NET languages, ARRAY types use a zero-based index, so
you use an index of 0 to access the first element, and an index of 2 to access
the third element. If the preceding Full Name column were declared as an
ARRAY, with first name in the first position, middle name in the second
position, and last name in the third position, you would access the first name
with index 0 and last name with index 2:
FullName[0], FullName[2]
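The index-based access above could be used in a query along these lines. This is a sketch under assumptions: the table name customers_array, the id column, and the column aliases are hypothetical, and size() is Hive's built-in function for counting the elements of a collection.

```sql
-- Hypothetical table in which FullName is an ARRAY rather than a STRUCT.
CREATE TABLE customers_array (
  id INT,
  FullName ARRAY<STRING>
);

-- Index 0 is the first element and index 2 is the third;
-- size() returns the number of elements in the array.
SELECT FullName[0]    AS first_name,
       FullName[2]    AS last_name,
       size(FullName) AS name_parts
FROM customers_array;
```

Unlike the STRUCT version, nothing in the ARRAY declaration records which position holds which part of the name, so that convention has to be maintained by whoever loads the data.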
A MAP column is a collection of key/value pairs, where the keys and the
values each have a declared data type. The keys and the values do not have to
use the same data type. A MAP for Full Name might be declared using the
following syntax:
MAP<string, string>
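For illustration, a MAP column is declared in a table the same way as the other complex types, and individual values are looked up by key in square brackets. The sketch below assumes a hypothetical customers_map table and an assumed key name 'FirstName'; map_keys() is Hive's built-in function that returns a map's keys as an array.

```sql
-- Hypothetical table in which FullName is a MAP from string keys to string values.
CREATE TABLE customers_map (
  id INT,
  FullName MAP<STRING, STRING>
);

-- Look up one value by key; 'FirstName' is an assumed key name.
-- map_keys() lists the keys that are present in each row's map.
SELECT FullName['FirstName'] AS first_name,
       map_keys(FullName)    AS name_keys
FROM customers_map;
```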
In the Full Name case, you would populate the MAP column with the
following key/value pairs: