Database Reference
In-Depth Information
From building_stg bs
Insert Overwrite Table building
select bs.building_id, bs.building_age, bs.hvac_type;
14. Now that the tables are loaded, you need to build the indexes:
ALTER INDEX Building_IDX_1 ON sensor REBUILD;
ALTER INDEX Building_IDX_2 ON building REBUILD;
After the index has been built, the data is now ready to analyze.
You can use Hive to query and aggregate the data. The following query
determines the maximum temperature difference between target and
actual temperatures for each day and HVAC type:
select max(s.delta_tmp), s.dt, b.hvac_type
from sensor s join building b on (s.building_id =
b.building_id)
Group By s.dt, b.hvac_type;
In Chapter 11, “Visualizing Big Data with Microsoft BI,” youwill seehowyou
canusetheHiveODBCconnectortoloadandanalyzethedatainMicrosoft's
BI toolset.
Using HBase or Hive as a Data Warehouse
Although both HBase and Hive are both considered data warehouse
structures, they differ significantly as to how they store and query data. Hive
is more like a traditional data warehouse reporting system. It structures the
data in a set of tables that you can join, aggregate, and query on using a
query language (Hive Query Language [HQL]) that is very similar to the
SQL,whichmostdatabasedevelopersarealreadyusedtoworkingwith.This
relieves you from having to write MapReduce code. The downside to Hive
is it can take a long time to process through the data and is not intended to
give clients instant results.
Hive is usually used to run processing through scheduled jobs and then
load the results into a summary type table that can be queried on by client
applications. One of the strengths of Hive and HCatalog is the ability to pass
data between traditional relational databases such as SQL Server. A good
use case for Hive and HCatalog is to load large amounts of unstructured
Search WWH ::




Custom Search