Figure 8-1. The Hadoop ecosystem. Components shown: Business Intelligence (Excel, Power View, ...), Data Access Layer (ODBC/Sqoop/REST), Stats Processing (RHadoop), Metadata (HCatalog), Graph (Pegasus), Machine Learning (Mahout), Scripting (Pig), Query (Hive), Distributed Processing (MapReduce), Distributed Storage (HDFS).
Programming MapReduce jobs directly can be tedious, and each job carries its own development, testing, and
maintenance costs. Hive democratizes access to Big Data by letting you use familiar tools such as Excel and a
SQL-like language instead of writing complex MapReduce jobs. Under the hood, Hive queries are broken down into
MapReduce jobs, but those jobs stay completely hidden from the user. The simplicity and SQL-like feel of Hive
queries have made Hive a popular choice, particularly among users with traditional SQL skills, because the
ramp-up time is much shorter than what is required to learn how to program MapReduce jobs directly.
Figure 8-2 gives an overview of the Hive architecture.
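
To make the idea of a query being broken down into MapReduce work more concrete, here is a minimal Python sketch of how a single GROUP BY aggregate can be expressed as map, shuffle, and reduce steps. It is a conceptual illustration only, not how Hive's planner is actually implemented. The table name page_views, its columns, and the sample rows are hypothetical and are not taken from the text.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a Hive table named `page_views`
# (the table and its columns are invented for this illustration).
ROWS = [
    {"country": "US", "bytes": 120},
    {"country": "DE", "bytes": 80},
    {"country": "US", "bytes": 30},
]

# The kind of HiveQL a user would write (kept as a string; not executed here).
QUERY = "SELECT country, SUM(bytes) FROM page_views GROUP BY country"


def map_phase(rows):
    """Emit one (key, value) pair per row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["country"], row["bytes"]


def shuffle_phase(pairs):
    """Collect all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Apply the aggregate (SUM) to the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_group_by_sum(rows):
    """Chain the three phases to answer the query above."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(run_group_by_sum(ROWS))
```

The user writes only the one-line query; the three phases are the kind of work that would otherwise have to be coded, tested, and maintained by hand.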
 