FIGURE 2.8 An example structure of RCFile: an HDFS block holding a relation ABCD is divided into row groups (Row group 1, Row group 2, ..., Row group n), each beginning with a 16-byte sync marker and a metadata header, followed by the table data laid out column by column within the row group. (From Y. He et al., RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems, in ICDE, pp. 1199-1208, 2011.)
during query execution. In particular, the metadata header section is compressed using the RLE (Run-Length Encoding) algorithm. The table data section is not compressed as a whole unit; instead, each column is independently compressed with the Gzip compression algorithm. When processing a row group, RCFile does not need to read the entire content of the row group into memory. It reads only the metadata header and the columns needed by a given query, and can therefore skip unnecessary columns and gain the I/O advantages of a column store. The metadata header is always decompressed and held in memory until RCFile processes the next row group. The loaded columns, however, are not all decompressed: RCFile uses a lazy decompression technique in which a column is decompressed in memory only once RCFile has determined that its data is actually needed for query execution.
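To illustrate the lazy decompression idea, the following minimal Java sketch decompresses a column's Gzip-compressed bytes only the first time a query touches it and caches the result for the remainder of the row group. The class and member names are illustrative assumptions and do not correspond to the actual RCFile implementation.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.zip.GZIPInputStream;

// Hypothetical reader for one row group; column names map to their compressed bytes.
class LazyRowGroupReader {
    private final Map<String, byte[]> compressedColumns;              // as stored on disk
    private final Map<String, byte[]> decompressed = new HashMap<>(); // filled lazily

    LazyRowGroupReader(Map<String, byte[]> compressedColumns) {
        this.compressedColumns = compressedColumns;
    }

    // Returns the decompressed bytes of a column, inflating it only on first access.
    byte[] column(String name) throws IOException {
        byte[] cached = decompressed.get(name);
        if (cached != null) {
            return cached;                         // decompression cost already paid
        }
        byte[] packed = compressedColumns.get(name);
        if (packed == null) {
            throw new NoSuchElementException("unknown column: " + name);
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(packed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            byte[] plain = out.toByteArray();
            decompressed.put(name, plain);         // cache until the next row group
            return plain;
        }
    }
}

Columns that the query never touches stay compressed, or are never read from disk at all, which is where the I/O and CPU savings of the lazy scheme come from.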
The notion of Trojan data layouts was introduced in [78]; it exploits the existing data block replication in HDFS to create different Trojan layouts on a per-replica basis. This means that, rather than keeping all data block replicas in the same layout, it stores each replica in a different Trojan layout, each optimized for a different subclass of queries. As a result, every incoming query can be scheduled to the most suitable data block replica. In particular, Trojan layouts change only the internal organization of a data block, not the placement of data among blocks. They co-locate attributes according to the query workload by applying a column grouping algorithm that uses an interestingness measure denoting how well a set of attributes speeds up most or all queries in the workload. The column groups are then packed so as to maximize the total interestingness of the data blocks. At query time, an incoming MapReduce job is transparently adapted to query the data block replica that minimizes the data access time, and the map tasks of the job are routed to the data nodes storing those data block replicas.
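As a rough sketch of how an incoming job might be routed to the most suitable replica, the following Java snippet models each replica's Trojan layout as a list of column groups and picks the replica that covers the query's attribute set while scanning the fewest columns. The names and the simple cost model are assumptions made for illustration; they are not the actual algorithm of [78].

import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical replica selection: each replica is described by its column grouping.
class ReplicaSelector {
    // Returns the index of the replica whose layout requires scanning the fewest
    // column values to cover all attributes referenced by the query.
    static int chooseReplica(List<List<Set<String>>> replicaLayouts, Set<String> queryAttrs) {
        int best = -1;
        int bestCost = Integer.MAX_VALUE;
        for (int r = 0; r < replicaLayouts.size(); r++) {
            int cost = 0;
            for (Set<String> group : replicaLayouts.get(r)) {
                // A whole column group must be read if the query needs any attribute in it.
                if (!Collections.disjoint(group, queryAttrs)) {
                    cost += group.size();          // crude proxy for bytes scanned
                }
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<List<Set<String>>> layouts = List.of(
            List.of(Set.of("a", "b"), Set.of("c", "d")),    // replica 0
            List.of(Set.of("a", "c"), Set.of("b", "d")),    // replica 1
            List.of(Set.of("a"), Set.of("b", "c", "d")));   // replica 2
        // A query touching only attribute "a" is routed to replica 2,
        // whose layout isolates that column.
        System.out.println(chooseReplica(layouts, Set.of("a"))); // prints 2
    }
}

In the scheme described above, the interestingness-driven column grouping determines these per-replica layouts offline from the query workload, before any such routing decision is made.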