ing with solutions for big data problems. Today, Hadoop is widely used by Yahoo!, Facebook, LinkedIn, Twitter, IBM, Rackspace, and many other companies. There is a vibrant community and a growing ecosystem.
Cassandra has built-in support for the Hadoop implementation of MapReduce (http://hadoop.apache.org/mapreduce).
Working with MapReduce
This section covers how to write a simple MapReduce job over data stored in Cassandra using the Java language. We also briefly cover how to output data into Cassandra and discuss ongoing progress with using Cassandra with Hadoop Streaming for languages beyond Java.
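As a sketch of what the map phase of such a job looks like, the mapper below is loosely modeled on the word count example discussed in this section. The exact input types handed to the mapper vary between Cassandra versions; the ByteBuffer row key, the SortedMap of columns, and the WordCountMapper class name are assumptions for illustration rather than the definitive API.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// ColumnFamilyInputFormat feeds the mapper one Cassandra row at a time:
// the row key plus a map of that row's columns (types assumed here).
public class WordCountMapper
        extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
            throws IOException, InterruptedException {
        // Tokenize each column value and emit (word, 1) pairs,
        // exactly as in the classic Hadoop word count.
        for (IColumn column : columns.values()) {
            String value = ByteBufferUtil.string(column.value());
            for (String token : value.split("\\s+")) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

From Hadoop's perspective this is an ordinary mapper; only the input key and value types reflect that the records come from a Cassandra column family rather than from files in HDFS.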
NOTE
The word count example given in this section is also found in the Cassandra source download in its contrib module. It can be compiled and run using instructions found there. It is best to run with that code, as the current version might have minor modifications. However, the principles remain the same.
For convenience, the word count MapReduce example can be run locally against a single Cassandra node. However, for more information on how to configure Cassandra and Hadoop to run MapReduce in a more distributed fashion, see the section Cluster Configuration.
Cassandra Hadoop Source Package
Cassandra has a Java source package for Hadoop integration code, called
org.apache.cassandra.hadoop. There we find:
ColumnFamilyInputFormat
The main class we'll use to interact with data stored in Cassandra from Hadoop. It's an extension of Hadoop's InputFormat abstract class.
ConfigHelper
A helper class to configure Cassandra-specific information such as the server node to point
to, the port, and information specific to your MapReduce job.
ColumnFamilySplit
The extension of Hadoop's InputSplit abstract class that creates splits over our Cassandra
data. It also provides Hadoop with the location of the data, so that it may prefer running tasks
on nodes where the data is stored.
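As a rough sketch of how these classes fit together, the job setup below reads from a single local node, assuming a hypothetical Keyspace1/Standard1 column family with a text column and the WordCountMapper sketched earlier. The ConfigHelper method names shown follow later Cassandra releases and may differ in the version you are running, so treat this as illustrative rather than definitive.

import java.util.Arrays;

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class WordCountJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "cassandra-word-count");
        job.setJarByClass(WordCountJob.class);

        // Read input from Cassandra rather than from HDFS.
        job.setInputFormatClass(ColumnFamilyInputFormat.class);

        // ConfigHelper points the job at the node, port, partitioner, and
        // column family (keyspace and column family names are assumptions).
        ConfigHelper.setInputInitialAddress(job.getConfiguration(), "localhost");
        ConfigHelper.setInputRpcPort(job.getConfiguration(), "9160");
        ConfigHelper.setInputPartitioner(job.getConfiguration(),
                "org.apache.cassandra.dht.RandomPartitioner");
        ConfigHelper.setInputColumnFamily(job.getConfiguration(), "Keyspace1", "Standard1");

        // Restrict each row to the single column whose value we want to count.
        SlicePredicate predicate = new SlicePredicate()
                .setColumn_names(Arrays.asList(ByteBufferUtil.bytes("text")));
        ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(IntSumReducer.class);   // Hadoop's stock summing reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileOutputFormat.setOutputPath(job, new Path("/tmp/word_count_output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Behind the scenes, ColumnFamilyInputFormat asks the cluster for its token ranges and turns each range into a ColumnFamilySplit, which is what lets Hadoop schedule map tasks near the nodes that hold the data.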