'FirstName', 'John'
'MiddleName', 'Doe'
'LastName', 'Smith'
You can access MAP column elements using the same syntax as you use with
an ARRAY, except that you use the key value instead of the position as the
index. The first and last names would be accessed with this syntax:
FullName['FirstName'], FullName['LastName']
After looking at the possible data types, you may be wondering how these
are stored in Hive. The next section covers the file formats that can be used
to store the data.
File Formats
Hive uses Hadoop as the underlying data store. Because the actual data is
stored in Hadoop, it can be in a wide variety of formats. As discussed in
Chapter 5, “Storing and Managing Data in HDFS,” Hadoop stores files and
doesn't impose any restrictions on the content or format of those files. Hive
offers enough flexibility that you can work with almost any file format, but
some formats require significantly more effort.
The simplest files to work with in Hive are text files, and this is the default
format Hive expects for files. These text files are normally delimited by
specific characters. Common formats in business settings are
comma-separated value files or tab-separated value files. However, the
drawback of these formats is that commas and tabs often appear in real
data; that is, they are embedded inside other text, and not intended as
delimiters in all instances. For that reason, Hive by default uses control
characters as delimiters, which are less likely to appear in real data. Table
6.2 describes these default delimiters.
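To make the delimiter scheme concrete, the following is a minimal Python sketch (not Hive code) of how one line of a default-delimited text file could be split apart. It assumes the commonly documented Hive defaults: Ctrl-A (\x01) between columns, Ctrl-B (\x02) between collection items, and Ctrl-C (\x03) between a map key and its value; confirm these against Table 6.2. The row layout (an id, a FullName map, and a city) and the function names parse_map and parse_row are hypothetical, chosen only for illustration.

```python
# Assumed Hive default text-file delimiters (control characters).
FIELD_DELIM = "\x01"       # Ctrl-A: separates columns in a row
COLLECTION_DELIM = "\x02"  # Ctrl-B: separates ARRAY items or MAP entries
MAP_KEY_DELIM = "\x03"     # Ctrl-C: separates a MAP key from its value


def parse_map(raw):
    """Split one MAP column into a dict of key -> value."""
    entries = raw.split(COLLECTION_DELIM)
    return dict(entry.split(MAP_KEY_DELIM, 1) for entry in entries)


def parse_row(line):
    """Split one text line into (id, FullName map, city).

    The three-column layout is hypothetical, not taken from the book.
    """
    customer_id, raw_name, city = line.rstrip("\n").split(FIELD_DELIM)
    return customer_id, parse_map(raw_name), city


# Hypothetical sample row. The city value contains a comma, which would
# break a naive comma-separated file but is harmless here.
sample = (
    "42" + FIELD_DELIM
    + "FirstName" + MAP_KEY_DELIM + "John" + COLLECTION_DELIM
    + "MiddleName" + MAP_KEY_DELIM + "Doe" + COLLECTION_DELIM
    + "LastName" + MAP_KEY_DELIM + "Smith" + FIELD_DELIM
    + "Seattle, WA" + "\n"
)

customer_id, full_name, city = parse_row(sample)

# Python equivalent of FullName['FirstName'], FullName['LastName']
print(full_name["FirstName"], full_name["LastName"])
print(city)
```

Because the comma in the city value is ordinary data rather than a delimiter, it survives parsing intact, which is the advantage the control characters are meant to provide.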
 