Chapter 4. Impala Walkthrough with an Example
In this chapter, we will work through a use case to see Impala concepts in action.
This gives you a realistic scenario to practice on and shows how and where Impala
statements fit into real-world applications. The scenario is described in the
following sections.
Creating an example scenario
We are going to work with information about automobiles and motorcycles, stored in
two separate text files, one for each kind of vehicle.
The following conceptual image shows that within the Autos database, there are two
tables named Motorcycles and Automobiles.
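As a minimal sketch of the layout described above, the database could be created as follows. The lowercase name autos is an assumption made here for illustration; Impala treats identifiers as case-insensitive.

```sql
-- Sketch only: create the database that will hold both tables.
-- The name "autos" is assumed from the Autos database described in the text.
CREATE DATABASE IF NOT EXISTS autos;

-- Make it the current database for the statements that follow.
USE autos;

-- List the tables in the current database; the list stays empty
-- until the Automobiles and Motorcycles tables are created.
SHOW TABLES;
```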
As you know by now, Impala runs on the DataNodes, and the files in our project are
stored on HDFS. First we will load these files from HDFS into Impala tables, and
then we will use SQL statements to process this information through multiple queries.
Example dataset one - automobiles (automobiles.txt)
Let's take a look at this example dataset, which has a list of automobile names and
their properties as defined in the schema. The following is the first text file, which has
automobile-specific data:
File: automobiles.txt
Schema: make, model, year, fuel-type, numOfDoors, design, type,
cylinders, horsepower, city_hwy_mpg, price
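To make the schema concrete, here is a hedged sketch of how it might map to an Impala table and how the file could be loaded from HDFS. Several details are assumptions, not taken from the chapter: the column types, the comma field delimiter, the HDFS path, the existence of a database named autos, and the renaming of fuel-type to fuel_type (a hyphen is not valid in an unquoted identifier).

```sql
-- Sketch under stated assumptions: column types and the delimiter are guesses,
-- because the data rows are not shown here. Assumes a database named "autos" exists.
CREATE TABLE IF NOT EXISTS autos.automobiles (
  make          STRING,
  model         STRING,
  `year`        INT,     -- backticks as a precaution against keyword clashes
  fuel_type     STRING,  -- "fuel-type" in the schema; renamed for a valid identifier
  numOfDoors    STRING,  -- assumed textual; use INT if the file stores numbers
  design        STRING,
  `type`        STRING,
  cylinders     STRING,  -- assumed textual; use INT if the file stores numbers
  horsepower    INT,
  city_hwy_mpg  STRING,  -- assumed to hold a combined city/highway value
  price         INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Hypothetical HDFS path. LOAD DATA moves the file into the table's
-- directory rather than copying it.
LOAD DATA INPATH '/user/impala/autos/automobiles.txt'
INTO TABLE autos.automobiles;

-- Quick check that the rows are visible.
SELECT make, model, price
FROM autos.automobiles
LIMIT 5;
```

If the file uses a different separator, only the FIELDS TERMINATED BY clause needs to change.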
Here is the data in the automobiles.txt file: