INSERT [INTO | OVERWRITE] TABLE table_name
SELECT ….
Each INSERT statement creates new data files in HDFS with unique names, so multiple INSERT INTO statements can be executed simultaneously. If the INSERT commands were executed on a different Impala daemon (impalad), running the REFRESH table_name command on the other nodes syncs the data so that every node sees a single, consistent table. The INSERT statement has many options; to learn about them, my suggestion would be to look at the SQL statements documentation for INSERT.
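As a minimal sketch of the two INSERT forms and the follow-up REFRESH described above, consider the statements below. The table names sales and sales_staging are hypothetical and are assumed to share the same column layout.

```sql
-- Hypothetical tables: sales and sales_staging have identical columns.

-- Append the selected rows to the data already in sales.
INSERT INTO TABLE sales
SELECT * FROM sales_staging;

-- Replace the current contents of sales with the selected rows.
INSERT OVERWRITE TABLE sales
SELECT * FROM sales_staging;

-- Run while connected to a different impalad node so that it
-- picks up the data files written by the statements above.
REFRESH sales;
```

INSERT INTO adds to the existing data, while INSERT OVERWRITE replaces it, so the second statement here leaves sales holding only the rows from sales_staging.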
The SELECT statement
The SELECT statement is used to select data from a table that is part of the database currently in use. To choose the database, you run the USE statement first and then use the SELECT statement. Here are some features of the SELECT statement in Impala:
• The DISTINCT clause can also be used, but it is applied per query
• The SELECT statement also supports the WHERE, GROUP BY, and HAVING clauses
• You can also use LIMIT while using ORDER BY with SELECT
I have written several examples that use SELECT with other clauses in this chapter, so here is the syntax of the SELECT statement for reference purposes:
SELECT column_name,column_name FROM table_name;
SELECT * FROM table_name;
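The features listed earlier can be combined in a single query. The following is only a sketch under stated assumptions: the database sales_db, the table sales, and its columns region and amount are hypothetical names chosen for illustration.

```sql
-- Hypothetical database and table names, used for illustration only.
USE sales_db;

-- Return each region once.
SELECT DISTINCT region FROM sales;

-- Filter rows, group them, filter the groups, then sort and cap the result.
SELECT region, COUNT(*) AS order_count
FROM sales
WHERE amount > 100
GROUP BY region
HAVING COUNT(*) > 10
ORDER BY order_count DESC
LIMIT 5;
```

WHERE filters individual rows before grouping, whereas HAVING filters the groups produced by GROUP BY; LIMIT then restricts how many of the sorted rows are returned.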
Internal and external tables
A short discussion of internal and external tables is useful while learning about table-specific statements. A table created with CREATE TABLE is considered an internal table, whereas a table created with the CREATE EXTERNAL TABLE statement is considered an external table.
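To make the distinction concrete, here is a hedged sketch of both statements. The table names, columns, and HDFS path are hypothetical, and the external table is assumed to point at comma-delimited text files that already exist at that location.

```sql
-- Internal table: created with a plain CREATE TABLE statement.
CREATE TABLE customers_internal (
  id INT,
  name STRING
);

-- External table: created with CREATE EXTERNAL TABLE.
-- The LOCATION path below is a hypothetical HDFS directory.
CREATE EXTERNAL TABLE customers_external (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hypothetical/customers';
```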
The properties of internal tables in Impala are as follows: