2 MapReduce Family of Large-Scale Data-Processing Systems
Sherif Sakr, Anna Liu, and Ayman G. Fayoumi
CONTENTS
2.1 Introduction
2.2 MapReduce Framework: Basic Architecture
2.3 Extensions and Enhancements of the MapReduce Framework
2.3.1 Processing Join Operations
2.3.2 Supporting Iterative Processing
2.3.3 Data and Process Sharing
2.3.4 Support of Data Indices and Column Storage
2.3.5 Effective Data Placement
2.3.6 Pipelining and Streaming Operations
2.3.7 System Optimizations
2.4 Systems of Declarative Interfaces for the MapReduce Framework
2.4.1 Sawzall
2.4.2 Pig Latin
2.4.3 Hive
2.4.4 Tenzing
2.4.5 Cheetah
2.4.6 YSmart
2.4.7 SQL/MapReduce
2.4.8 HadoopDB
2.4.9 Jaql
2.5 Sample MapReduce-Based Applications
2.6 Related Large-Scale Data-Processing Systems
2.6.1 SCOPE
2.6.2 Dryad/DryadLINQ
2.6.3 Spark
2.6.4 Nephele/PACT
2.6.5 BOOM Analytics
2.6.6 Hyracks/ASTERIX
2.7 Conclusions
References