State : Completed
SubmissionTime : 11/24/2013 7:35:18 AM
Cluster : democluster
PercentComplete :
JobId : job_201311240635_0006
Logging initialized using configuration in file:/C:/apps/dist/hive-0.11.0.1.3.1.0-06/conf/hive-log4j.properties
Loading data to table default.stock_analysis partition (exchange=NASDAQ)
OK
Time taken: 44.327 seconds
Repeat the preceding steps for each .csv file you need to load into the table. You need to replace only the .csv file name in $querystring, and you must make sure each file is loaded into the matching partition of the Hive table.
Listing 8-7 gives you the LOAD commands for each of the .csv files.
Listing 8-7. The LOAD commands
$querystring = "load data inpath 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData/tableFacebook.csv'
into table stock_analysis partition(exchange ='NASDAQ');"
$querystring = "load data inpath 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData/tableApple.csv'
into table stock_analysis partition(exchange ='NASDAQ');"
$querystring = "load data inpath 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData/tableGoogle.csv'
into table stock_analysis partition(exchange ='NASDAQ');"
$querystring = "load data inpath 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData/tableIBM.csv'
into table stock_analysis partition(exchange ='NYSE');"
$querystring = "load data inpath 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData/tableOracle.csv'
into table stock_analysis partition(exchange ='NYSE');"
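Because the statements in Listing 8-7 differ only in the file name and the partition value, they could also be generated in a loop. The following is a minimal sketch, not part of the original listing: the variable names $files, $basePath, and $queries are hypothetical, and the file-to-exchange mapping is simply copied from Listing 8-7. It only builds the query strings; each one would still be submitted the same way as in the preceding steps.

```powershell
# Hypothetical mapping of .csv file name to exchange partition (values from Listing 8-7).
# [ordered] keeps the entries in the order they are written (PowerShell 3.0 or later).
$files = [ordered]@{
    'tableFacebook.csv' = 'NASDAQ'
    'tableApple.csv'    = 'NASDAQ'
    'tableGoogle.csv'   = 'NASDAQ'
    'tableIBM.csv'      = 'NYSE'
    'tableOracle.csv'   = 'NYSE'
}

# Folder that holds the .csv files (same location as in Listing 8-7).
$basePath = 'wasb://democlustercontainer@democluster.blob.core.windows.net/debarchan/StockData'

# Build one LOAD statement per file, substituting the file name and partition value.
$queries = foreach ($entry in $files.GetEnumerator()) {
    "load data inpath '$basePath/$($entry.Key)' into table stock_analysis partition(exchange ='$($entry.Value)');"
}

# Print the generated statements for review before running them.
$queries
```

Keeping the mapping in one table makes it harder to load a file into the wrong partition, which is the main risk when editing $querystring by hand.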
Querying Tables with HiveQL
After you create tables and load data files into the appropriate locations, you can start to query the data by executing HiveQL SELECT statements against the tables. As with all data processing on HDInsight, HiveQL queries are implicitly executed as MapReduce jobs to generate the required results. HiveQL SELECT statements are similar to SQL queries, and they support common operations such as JOIN, UNION, and GROUP BY.
For example, you can use the code in Listing 8-8 to filter by the stock_symbol column and return only 10 rows as a sample, because you don't know how many rows the table may contain.