The Hadoop ecosystem provides the Pig and Hive frameworks for querying data stored in HDFS. In
ing interface for HD) is being released. We will not be covering HAWQ in this topic.
how HDFS data can be queried using some examples.
In this section, we focus on how to use Hive to access data stored
in HDFS. The following figure depicts the Hive architecture.
Hive requires the following in order to run:
• Java 6
• The Hadoop framework, with the Hadoop home directory configured
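The two dependencies above can be checked before starting Hive. The following is a minimal sketch, not an official Hive tool: the function names are hypothetical, and it assumes that the Hadoop home directory is exposed through the HADOOP_HOME environment variable and contains the bin/hadoop launcher, as in a standard Apache Hadoop layout.

```shell
#!/bin/sh
# Sketch of a prerequisite check for Hive (function names are hypothetical).

# Succeeds when a java executable is available on PATH.
have_java() {
    command -v java >/dev/null 2>&1
}

# Succeeds when HADOOP_HOME is set and contains an executable bin/hadoop.
# Assumption: the Hadoop home directory is exported as HADOOP_HOME.
hadoop_home_ok() {
    [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop" ]
}

# Reports on both dependencies; returns non-zero if either is missing.
check_hive_prereqs() {
    status=0
    if have_java; then
        # java prints its version banner on stderr
        java -version 2>&1 | head -n 1
    else
        echo "java not found on PATH"
        status=1
    fi
    if hadoop_home_ok; then
        echo "HADOOP_HOME=$HADOOP_HOME"
    else
        echo "HADOOP_HOME is unset or has no bin/hadoop"
        status=1
    fi
    return $status
}

check_hive_prereqs || echo "Resolve the items above before starting Hive."
```

The check only confirms that the tools are present. It does not verify the Java version, so the version line it prints should still be compared against the requirement listed above.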
Hive is an SQL-like interface for querying data on HDFS. Internally, it runs
queries as MapReduce jobs, so the work is distributed across the Hadoop cluster.
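To give a feel for what "SQL-like" means here, the following sketch writes a small HiveQL script and runs it with hive -f if Hive is installed. The table name (sales), its columns, and the HDFS directory /user/demo/sales are hypothetical and chosen only for illustration; they are not part of the example used in this section.

```shell
#!/bin/sh
# Sketch: a HiveQL script over comma-separated files already stored in HDFS.
# The table name, columns, and HDFS location are hypothetical.
HQL_FILE="${HQL_FILE:-/tmp/sales_by_region.hql}"

cat > "$HQL_FILE" <<'EOF'
-- Map a table definition onto an existing HDFS directory of CSV files.
CREATE EXTERNAL TABLE IF NOT EXISTS sales (
    id     INT,
    region STRING,
    amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/demo/sales';

-- An aggregate query; Hive executes it as a MapReduce job.
SELECT region, SUM(amount) AS total_amount
FROM sales
GROUP BY region;
EOF

# Run the script only when the hive client is available.
if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found on PATH; statements written to $HQL_FILE"
fi
```

An external table is used in this sketch so that dropping the table later leaves the underlying files in HDFS untouched.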
1. Copy the CSV data onto HDFS using the following commands: