Database Reference
Figure 15-2. Database tables are typically physically represented as an array of rows, with all the
columns in a row stored adjacent to one another
Figure 15-3. Large objects are usually held in a separate area of storage; the main row storage
contains indirect references to the large objects
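The layout described in the two figure captions can be sketched in a few lines of Python. This is a toy model only, not how any particular database engine lays out its pages; the names Table, LobRef, lob_area, and read_lob are hypothetical and chosen for illustration. It shows small columns kept together in the row array while each large object lives in a separate area and is reached through an indirect reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LobRef:
    """Hypothetical indirect reference into the separate large-object area."""
    offset: int
    length: int


class Table:
    """Toy table: an array of rows plus a separate area for large objects."""

    def __init__(self):
        self.rows = []               # main row storage, small columns side by side
        self.lob_area = bytearray()  # separate storage for large objects

    def insert(self, row_id, name, blob):
        # Append the large object to the separate area and keep only a
        # small reference to it inside the row itself.
        ref = LobRef(offset=len(self.lob_area), length=len(blob))
        self.lob_area.extend(blob)
        self.rows.append((row_id, name, ref))

    def read_lob(self, ref):
        # Follow the indirect reference to fetch the large object's bytes.
        return bytes(self.lob_area[ref.offset:ref.offset + ref.length])


table = Table()
table.insert(1, "sprocket", b"x" * 1000)
table.insert(2, "gizmo", b"y" * 2000)

# Scanning the small columns never touches the large-object bytes.
names = [name for _, name, _ in table.rows]

# A large object is fetched only when its reference is followed.
second_blob = table.read_lob(table.rows[1][2])
```

Because a row holds only an offset and a length for its large object, a scan over the small columns stays compact no matter how large the objects themselves are.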
The difficulty of working with large objects in a database suggests that a system such as
Hadoop, which is much better suited to storing and processing large, complex data objects,
is an ideal repository for such information. Sqoop can extract large objects from
tables and store them in HDFS for further processing.
As in a database, MapReduce typically materializes every record before passing it along
to the mapper. If individual records are truly large, this can be very inefficient, since each
record is read in full even when the mapper uses only a small part of it.
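One way to picture the alternative to full materialization is a record that carries only a handle to its large field and opens it as a stream on demand. The following is a minimal Python sketch under that assumption; it is not the Hadoop or Sqoop API, and LazyRecord, open, and count_bytes are hypothetical names used only for illustration.

```python
import io


class LazyRecord:
    """Hypothetical record that defers loading its large field."""

    def __init__(self, key, open_stream):
        self.key = key
        # Zero-argument callable that returns a binary stream over the
        # large field; nothing is read until open() is called.
        self._open_stream = open_stream

    def open(self):
        return self._open_stream()


def count_bytes(record, chunk_size=4096):
    """Consume the large field in fixed-size chunks, never all at once."""
    total = 0
    with record.open() as stream:
        while chunk := stream.read(chunk_size):
            total += len(chunk)
    return total


# An in-memory stream stands in for a large object held in separate storage.
payload = b"z" * 10_000
record = LazyRecord("widget-1", lambda: io.BytesIO(payload))

total = count_bytes(record)
```

With this arrangement, code that inspects only the key never pays for the large field, and code that does need it holds at most one chunk at a time.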