If you need additional information on physical files, include EXTENDED between EXPLAIN and the query.
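As a minimal sketch of where the keyword goes, the statement below assumes a table named ratings with movieid and rating columns. Those names are illustrative assumptions, so substitute your own table and query.

```sql
-- Assumption: a table named ratings with movieid and rating columns exists.
-- EXTENDED sits between EXPLAIN and the query whose plan you want to inspect.
EXPLAIN EXTENDED
SELECT movieid, AVG(rating) AS avg_rating
FROM ratings
GROUP BY movieid;
```

The exact contents of the extended plan vary with the Hive version, so treat the output as a diagnostic aid and not as a fixed format.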
Next, a simple use case of data partitioning is shown.
Partitioned Table
Partitioning a table enables you to segregate data into multiple namespaces and filter and query the
data set based on the namespace identifiers. Say a data analyst believed that ratings were affected by
the time of day at which users submitted them and wanted to split the ratings into two partitions, one
for all ratings submitted between 8 p.m. and 8 a.m. and the other for the rest of the day. You could
create a virtual column to identify this partition and save the data as such.
Then you would be able to filter, search, and cluster on the basis of these namespaces.
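One way the two-partition idea could look in HiveQL is sketched below. The table names (ratings, ratings_partitioned), the partition column time_bucket, its values 'night' and 'day', and the assumption that tstamp holds epoch seconds are all hypothetical choices made for illustration, not details taken from the text above.

```sql
-- Hypothetical target table: time_bucket is the virtual (partition) column.
CREATE TABLE ratings_partitioned (
  userid  INT,
  movieid INT,
  rating  INT,
  tstamp  STRING
)
PARTITIONED BY (time_bucket STRING);

-- Assumption: a source table named ratings exists and tstamp is epoch seconds.
-- Ratings submitted between 8 p.m. and 8 a.m. go into the 'night' partition.
INSERT OVERWRITE TABLE ratings_partitioned PARTITION (time_bucket = 'night')
SELECT userid, movieid, rating, tstamp
FROM ratings
WHERE hour(from_unixtime(CAST(tstamp AS BIGINT))) >= 20
   OR hour(from_unixtime(CAST(tstamp AS BIGINT))) < 8;

-- Everything else goes into the 'day' partition.
INSERT OVERWRITE TABLE ratings_partitioned PARTITION (time_bucket = 'day')
SELECT userid, movieid, rating, tstamp
FROM ratings
WHERE hour(from_unixtime(CAST(tstamp AS BIGINT))) >= 8
  AND hour(from_unixtime(CAST(tstamp AS BIGINT))) < 20;

-- Filtering on the partition column restricts the query to one namespace.
SELECT movieid, AVG(rating) AS avg_rating
FROM ratings_partitioned
WHERE time_bucket = 'night'
GROUP BY movieid;
```

Note that from_unixtime interprets the timestamp in the session's time zone, so the 8 p.m. boundary here is the cluster's local time and not necessarily the submitting user's.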
SUMMARY
This chapter tersely depicted the power and flexibility of Hive. It showed how the old goodness of
SQL can be combined with the power of Hadoop to deliver a compelling data analysis tool, one that
both traditional RDBMS developers and new big data pioneers can use.
Hive was built at Facebook and was open sourced as a subproject of Hadoop. Now a top-level
project, Hive continues to evolve rapidly, bridging the gap between the SQL and the NoSQL worlds.
Prior to Hive's release as open source, Hadoop was arguably useful only to a subset of developers in
any given group needing to access “big data” in their organization. Some say Hive nullifies the use
of the buzzword NoSQL, the topic of this book. It almost makes some forcefully claim that NoSQL
is actually an acronym that expands to Not Only SQL.