NOTE
Although Hive looks like a relational database, with tables, columns,
indexes, and so on, and shares much of the same terminology, it is not a
relational database. Hive does not enforce referential integrity, it does
not support transactions, and it does not provide ACID (atomicity,
consistency, isolation, and durability) guarantees for Hadoop data stores.
Providing Structure for Unstructured Data
Users and the tools they use for querying data warehouses generally expect
tabular, well-structured data. They expect the data to be delivered in a row/
column format, and they expect consistency in the data values returned.
Take the example of a user requesting a data set containing all the sales
transactions for yesterday. Imagine the user's reaction if some rows in the
data set contained 10 columns, some contained 8 columns, and some
contained 15. The user would also be very surprised to find that the unit cost
column in the data set contained valid numeric values on some rows and
alphabetic characters on others.
Because Hadoop data stores don't enforce a particular schema, this is a
very real scenario when querying Hadoop. Hive helps by enabling you to
specify a schema of columns and their types for the data. Then, when the
data is queried through Hive, Hive ensures that the results conform to
the expected schema.
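To make the idea of applying a schema at query time concrete, here is a minimal Python sketch, not Hive's actual implementation. The column names, the comma delimiter, and the apply_schema helper are all hypothetical. It assumes behavior similar to Hive's default text format, where a value that cannot be read as the declared type is returned as NULL, missing trailing fields are returned as NULL, and extra fields are ignored.

```python
# Hypothetical schema: (column name, converter) pairs. These names are
# illustrative only and do not come from the original text.
SCHEMA = [("order_id", int), ("product", str), ("unit_cost", float)]


def apply_schema(line, schema=SCHEMA, delimiter=","):
    """Project one raw text line onto the declared columns."""
    fields = line.rstrip("\n").split(delimiter)
    row = {}
    for index, (name, convert) in enumerate(schema):
        if index >= len(fields):
            row[name] = None  # field missing from this line
            continue
        try:
            row[name] = convert(fields[index])
        except ValueError:
            row[name] = None  # value does not match the declared type
    return row  # fields beyond the declared columns are dropped


raw_lines = [
    "1,widget,2.50",                  # well-formed
    "2,gadget,abc",                   # alphabetic unit cost
    "3,gizmo",                        # too few fields
    "4,doohickey,9.99,extra,fields",  # too many fields
]
rows = [apply_schema(line) for line in raw_lines]
```

Every row in rows has exactly the three declared columns, regardless of how ragged the raw lines were. That is the consistency the querying user expects.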
These schemas are declared by creating a table. The actual table data is
stored as files in the Hadoop file system. When you request data from the
table, Hive translates that request into reads of the appropriate files in
the Hadoop file system and returns the data in a format that matches the
table definition.
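The relationship between a table and its files can be sketched in Python as follows. This is an illustration under stated assumptions, not how Hive reads data: a local temporary directory stands in for the Hadoop file system, the files are assumed to be comma-delimited text, and the read_table helper, the sales directory, and the part-file names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def read_table(table_dir, columns):
    """Yield one dict per line across every file in the table's directory."""
    for path in sorted(Path(table_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open(newline="") as handle:
            for fields in csv.reader(handle):
                # Truncate or pad each line so it matches the declared columns.
                padded = fields[:len(columns)] + [None] * (len(columns) - len(fields))
                yield dict(zip(columns, padded))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one directory per table, holding any number of files.
    table_dir = Path(tmp) / "sales"
    table_dir.mkdir()
    (table_dir / "part-00000").write_text("1,widget,2.50\n2,gadget,4.00\n")
    (table_dir / "part-00001").write_text("3,gizmo,1.25\n")
    result = list(read_table(table_dir, ["order_id", "product", "unit_cost"]))
```

The caller names only the table and its columns. Which files hold the data, and how many there are, stays hidden behind the table definition.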
The table definitions are stored in the Hive metadata store, or metastore. By
default, the metastore is an embedded Derby database. This metastore is a
relational database that captures the table metadata (the name of the table,
the columns and data types it contains, and the format that the underlying
files are expected to be in).
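As a rough illustration of the kind of metadata described above, the following Python sketch stores a table definition in a small relational database. It uses an in-memory SQLite database purely as a stand-in for Derby. The tbls and cols tables, their columns, and the register_table and describe_table helpers are hypothetical and do not reflect the real metastore schema.

```python
import sqlite3

# In-memory SQLite database standing in for the metastore (assumption:
# the real default is embedded Derby, with a different schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbls (
    tbl_id      INTEGER PRIMARY KEY,
    tbl_name    TEXT UNIQUE,
    file_format TEXT
);
CREATE TABLE cols (
    tbl_id   INTEGER REFERENCES tbls(tbl_id),
    position INTEGER,
    col_name TEXT,
    col_type TEXT
);
""")


def register_table(name, file_format, columns):
    """Record a table's name, file format, and ordered column list."""
    cursor = conn.execute(
        "INSERT INTO tbls (tbl_name, file_format) VALUES (?, ?)",
        (name, file_format),
    )
    tbl_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO cols VALUES (?, ?, ?, ?)",
        [(tbl_id, i, col, typ) for i, (col, typ) in enumerate(columns)],
    )
    return tbl_id


def describe_table(name):
    """Return the (column name, type) pairs for a table, in declared order."""
    return conn.execute(
        "SELECT c.col_name, c.col_type FROM cols c "
        "JOIN tbls t ON t.tbl_id = c.tbl_id "
        "WHERE t.tbl_name = ? ORDER BY c.position",
        (name,),
    ).fetchall()


register_table(
    "sales",
    "textfile",
    [("order_id", "int"), ("product", "string"), ("unit_cost", "double")],
)
sales_columns = describe_table("sales")
```

Only the description of the table lives in this database. The rows themselves remain in files in the Hadoop file system.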