| Type | Description | Examples | SQL Server Equivalent |
|---|---|---|---|
| … | … precision. String: JDBC-compliant timestamp format YYYY-MM-DD HH:MM:SS.fffffffff. | … | … |
| DATE | A date in YYYY-MM-DD format. | {{2012-01-01}} | date |
| BINARY | A series of bytes. | | binary(n) |
| STRUCT | Defines a column that contains a defined set of additional values and their types. | struct('John', 'Smith') | |
| MAP | Defines a collection of key/value pairs. | map('first', 'John', 'last', 'Smith') | |
| ARRAY | Defines a sequenced collection of values. | array('John', 'Smith') | |
| UNION | Similar to sql_variant types. They hold one value at a time, but it can be any one of the defined types for the column. | Varies depending on column | sql_variant |
The types that are unique to Hive are MAP, ARRAY, and STRUCT. Hive supports
these types so that it can work better with the denormalized data that is
often found in Hadoop data stores. Relational database tables are typically
normalized; that is, a row holds only one value for a given column. In Hadoop,
though, it is not uncommon to find data where many values are stored in a row
for a “column.” This denormalization makes the data easier and faster to
write, but more challenging to retrieve in a tabular format.
Hive addresses this with the MAP, ARRAY, and STRUCT types, which let a
developer flatten out the denormalized data into a multicolumn structure.
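As a rough illustration of that flattening, the HiveQL sketch below declares one column of each complex type and then projects the nested values into ordinary columns. The table name `customers`, its column names, and the sample values are hypothetical and chosen only for this example; `named_struct` is used in place of `struct` so that the fields carry names, and the `INSERT ... SELECT` without a `FROM` clause assumes a reasonably recent Hive release.

```sql
-- Hypothetical table: one STRUCT, one MAP, and one ARRAY column.
CREATE TABLE customers (
  id     INT,
  name   STRUCT<first:STRING, last:STRING>,
  phones MAP<STRING, STRING>,
  emails ARRAY<STRING>
);

-- Complex values are built with constructor functions, so the row is
-- loaded with INSERT ... SELECT rather than INSERT ... VALUES.
INSERT INTO TABLE customers
SELECT 1,
       named_struct('first', 'John', 'last', 'Smith'),
       map('home', '555-0100', 'work', '555-0199'),
       array('john@example.com', 'jsmith@example.com');

-- Flatten the nested values into a multicolumn result:
-- dot notation for STRUCT fields, [key] for MAP, [index] for ARRAY.
SELECT id,
       name.first     AS first_name,
       name.last      AS last_name,
       phones['home'] AS home_phone,
       emails[0]      AS primary_email
FROM customers;

-- Alternatively, turn each ARRAY element into its own row.
SELECT id, email
FROM customers
LATERAL VIEW explode(emails) e AS email;
```

The first query keeps one row per customer and widens it with extra columns; the `LATERAL VIEW explode` form instead produces one row per array element, which tends to be the better fit when the number of values per row is not fixed.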