Database Reference
In-Depth Information
confusion about which table a column belongs to. The SELECT statement will only output a data row if a matching row
is found on the “myear,” “manufacturer,” and “model” columns of both tables:
SELECT
rd1.*
FROM
fuel.consumption rd1,
fuel.consumption3 rd2
WHERE
rd1.myear = rd2.myear AND
rd1. manufacturer = rd2. manufacturer AND
rd1. model = rd2. model
The INSERT Statement
The INSERT statement allows you to insert a single row into an Impala table, or you can combine INSERT with a SELECT
statement to insert multiple rows. These commands are designed to move large data volumes.
In a first example of this command, the INSERT statement inserts a single row into the fuel database table called
consumption2. The number and type of column values specified in parentheses must match the table definition.
Because the data values are strings, they are shown within single quotes:
INSERT INTO fuel.consumption2 VALUES ('1995','ACURA','INTEGRA','SUBCOMPACT','1.8','4','A4',
'X','10.2','7','28','40','1760','202')
Although this single INSERT statement moves a single row of data into the table fuel.consumption2, a second
example offers better performance by using a SELECT statement to populate the table fuel.consumption2 via a bulk
insert:
INSERT INTO TABLE fuel.consumption2 SELECT * FROM fuel.consumption3
This second version is a simple example; no extra filters or table joins have been used. But the size of the subtable
could be large; with a simple statement, you could copy a large volume of data.
Note: When executing an INSERT statement, you may receive an error message similar to the following:
AnalysisException: Unable to INSERT into target table (fuel.consumption2) because Impala does not
have WRITE access to at least one HDFS path: hdfs://hc1nn/user/hive/warehouse/fuel.db/consumption2
This error occurs if the referenced database and/or table name on HDFS is not owned by the HDFS impala user.
To fix this, you change the ownership with a statement like the following:
[hadoop@hc1r1m1 ~]$ hdfs dfs -chown -R impala:supergroup /user/hive/warehouse/fuel.db/
When run as the hadoop user, this statement recursively changes ownership of the HDFS fuel.db directory to the
impala user and changes the group to a supergroup. Then, you drop the table and re-create it to update the Impala
metadata for the table. After these steps, your INSERT statement should work.
This has been a very short introduction to Cloudera Impala SQL. For full details, check the Impala web site at
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala . Additionally, try to create your
own SQL statements by joining these simple building blocks. For instance, you can create SELECT statements with
WHERE clauses. And you can add table joins to your SQL statements and consider adding subqueries.
 
Search WWH ::




Custom Search