Figure 9-2. Impala scripting via Hue user interface
Uses of Impala
The real power of Impala lies not in its interfaces but in its query language and the options that it provides. For
example, it can access HDFS-based data via external tables, and it offers standard SQL-based operations such as
filters, table joins, subqueries, and inserts. These operations are described next, step by step, and then
their Hive equivalents are examined. Because the database is built on top of HDFS, it inherits that file system's
scalability and robustness.
To demonstrate Impala's SQL-based functionality, I first need some data to process. In Chapter 5,
I uploaded a series of fuel consumption CSV data files to the HDFS directory /user/hue2/fuel_consumption/;
the Hadoop file system ls command that follows lists the uploaded files:
[hadoop@hc1r1m1 ~]$ hdfs dfs -ls /user/hue2/fuel_consumption
Found 16 items
-rw-r--r-- 2 hadoop hue2 248956 2014-09-07 18:17 /user/hue2/fuel_consumption/MY1995-1999 Fuel Consumption Ratings.csv
-rw-r--r-- 2 hadoop hue2 45203 2014-09-07 18:17 /user/hue2/fuel_consumption/MY2000 Fuel Consumption Ratings.csv
.......................
-rw-r--r-- 2 hadoop hue2 77452 2014-09-07 18:17 /user/hue2/fuel_consumption/MY2013 Fuel Consumption Ratings.csv
-rw-r--r-- 2 hadoop hue2 77186 2014-09-07 18:17 /user/hue2/fuel_consumption/MY2014 Fuel Consumption Ratings.csv
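Because these file names contain spaces, any script that post-processes the listing has to rebuild the path from several whitespace-separated fields. The following is a minimal sketch of one way to do that, not part of the original text. It assumes the eight-column layout shown in the listing (permissions, replication, owner, group, size, date, time, path), and the function name summarize_ls is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarize `hdfs dfs -ls` output read from stdin.
# Assumed columns: perms, replication, owner, group, size, date, time, path.
# summarize_ls is a hypothetical helper name, not an HDFS command.
summarize_ls() {
  awk '
    /^Found [0-9]+ items$/ { next }   # skip the "Found N items" header
    NF >= 8 {
      size = $5
      # Rebuild the path from field 8 onward so embedded spaces survive.
      # Note: runs of several spaces in a name collapse to a single space.
      path = $8
      for (i = 9; i <= NF; i++) path = path " " $i
      total += size
      count++
      printf "%d\t%s\n", size, path
    }
    END { printf "files=%d bytes=%d\n", count, total }
  '
}

# Usage on a cluster (requires the hdfs client on the PATH):
#   hdfs dfs -ls /user/hue2/fuel_consumption | summarize_ls
```

The function prints one tab-separated size and path per file, followed by a line giving the file count and the total size in bytes. When passing such a path back to an hdfs command, quote it so that the shell keeps it as a single argument.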