NOTE
Refer back to Chapter 1 for definitions of the Hadoop projects YARN and
Tez.
Phase 2 included the following enhancements:
• Performance: Stinger Phase 2 made queries faster through several
changes. A new logical optimizer, the Correlation Optimizer, was
introduced. Its job is to merge multiple correlated MapReduce jobs into
a single job, which reduces data movement. ORDER BY became a parallel
operation. Furthermore, predicate pushdown was implemented so that
ORCFile can skip over rows that cannot match a filter, much like
segment skipping in SQL Server. Optimizations were also added for
COUNT(DISTINCT), through the hive.map.groupby.sorted configuration
property.
• SQL compatibility: Two significant data types were introduced:
VARCHAR and DATE. GROUP BY was enhanced to support struct and union
types. Lateral views were extended to support an "outer" join
behavior, and TRUNCATE was extended to support truncation of columns.
New user-defined functions (UDFs) were added to work over the binary
data type. Finally, partition switching entered the product courtesy of
ALTER TABLE ... EXCHANGE PARTITION.
NOTE
SQL Server does not support lateral views. That's because SQL Server
has no data type for arrays and no functions to interact with such a
type. To learn about lateral views, head over to
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView.
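To make the "outer" lateral view behavior concrete, here is a minimal Python sketch of what exploding an array column does to a row set. It is not Hive code. The function lateral_view_explode, the sample pages data, and the column names are all hypothetical, and for brevity the sketch drops the array column from the output.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def lateral_view_explode(rows: List[Row], array_col: str, alias: str,
                         outer: bool = False) -> Iterator[Row]:
    """Hypothetical helper: mimic exploding array_col into one row per element."""
    for row in rows:
        # Carry the non-array columns through unchanged.
        base = {k: v for k, v in row.items() if k != array_col}
        items = row.get(array_col) or []
        if not items:
            # Inner behavior drops the row; "outer" keeps it with a NULL (None).
            if outer:
                yield {**base, alias: None}
            continue
        for item in items:
            yield {**base, alias: item}


# Illustrative data only: one page with two ads, one page with none.
pages = [
    {"page": "home", "ad_ids": [1, 2]},
    {"page": "about", "ad_ids": []},
]

inner_rows = list(lateral_view_explode(pages, "ad_ids", "ad_id"))
outer_rows = list(lateral_view_explode(pages, "ad_ids", "ad_id", outer=True))
```

With the inner behavior, the "about" page disappears because its array is empty. With the outer behavior, it survives as a single row whose ad_id is None, much as a left outer join preserves unmatched rows.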
• End of HCatalog project: With Hive 0.12, HCatalog ceased to exist as its
own project and was merged into Hive.
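The row skipping that predicate pushdown enables, mentioned under Performance above, can be sketched in a few lines of Python. This is a simplified illustration of the general min/max statistics technique, not the ORCFile reader itself. The RowGroup class, the group size, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """A block of rows plus the min/max statistics kept for one column."""
    min_value: int
    max_value: int
    values: List[int]


def build_row_groups(values: List[int], group_size: int) -> List[RowGroup]:
    """Split a column into fixed-size groups and record min/max for each."""
    groups = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        groups.append(RowGroup(min(chunk), max(chunk), chunk))
    return groups


def scan_greater_than(groups: List[RowGroup],
                      threshold: int) -> Tuple[List[int], int]:
    """Evaluate `value > threshold`, returning matches and groups actually read."""
    matches: List[int] = []
    groups_read = 0
    for group in groups:
        if group.max_value <= threshold:
            # No value in this group can satisfy the predicate: skip it unread.
            continue
        groups_read += 1
        matches.extend(v for v in group.values if v > threshold)
    return matches, groups_read


# Illustrative data: values 1..12 stored in three groups of four rows.
groups = build_row_groups(list(range(1, 13)), group_size=4)
matches, groups_read = scan_greater_than(groups, threshold=9)
```

In this example only the last group is read, because the statistics for the first two groups already show that none of their values can exceed 9. The benefit depends on the data being at least loosely clustered on the filtered column.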