ConfigHelper
The ConfigHelper class is a gateway for configuring Cassandra-specific settings for Hadoop. It is a plain utility class that validates the settings passed to it and sets them on Hadoop's org.apache.hadoop.conf.Configuration instance for the job. This configuration is made available to the Mapper and the Reducer.
The ConfigHelper class saves developers from entering a wrong property name: because every property is set through a method, any typo is caught at compile time. It may be worth looking at the JavaDoc for ConfigHelper. Here are some of the commonly used methods:
• setInputInitialAddress: This can be a hostname or private IP of one of the Cassandra nodes.
• setInputRpcPort: This sets the RPC port if it has been changed from the default. If not set, the default Thrift port 9160 is used.
• setInputPartitioner: This sets the partitioner to match the underlying Cassandra storage setting.
• setInputColumnFamily: This sets the details of the column family to pull data from.
• setInputSlicePredicate: This sets the columns that are pulled from the column family for the Mapper to work on.
• setOutputInitialAddress: This sets the address of the Cassandra cluster (one of its nodes) where the result is published; it is usually the same as InputInitialAddress.
• setOutputRpcPort: This sets the RPC port of the cluster where the result is stored.
• setOutputPartitioner: This is the partitioner used in the output cluster.
• setOutputColumnFamily: This sets the details of the column family in which results are stored.
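To see why setting properties through methods is safer than writing raw property keys, consider the following minimal, self-contained sketch. It does not use the real ConfigHelper or Hadoop's Configuration: the MiniConfigHelper class, the Map that stands in for the job configuration, and the property key names are all hypothetical and chosen only for illustration. Only the default Thrift port 9160 and the setter names mirror the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a ConfigHelper-style utility; not the Cassandra class.
public class MiniConfigHelper {
    // Assumed key names for this sketch; the real class defines its own keys.
    private static final String INPUT_ADDRESS = "example.input.address";
    private static final String INPUT_RPC_PORT = "example.input.rpc.port";
    private static final String DEFAULT_RPC_PORT = "9160"; // default Thrift port

    // Validates the address, then stores it under the single, centrally defined key.
    public static void setInputInitialAddress(Map<String, String> conf, String address) {
        if (address == null || address.trim().isEmpty()) {
            throw new IllegalArgumentException("address must not be empty");
        }
        conf.put(INPUT_ADDRESS, address.trim());
    }

    // Validates that the port is numeric and in range before storing it.
    public static void setInputRpcPort(Map<String, String> conf, String port) {
        int parsed = Integer.parseInt(port); // NumberFormatException if not numeric
        if (parsed < 1 || parsed > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        conf.put(INPUT_RPC_PORT, port);
    }

    public static String getInputInitialAddress(Map<String, String> conf) {
        return conf.get(INPUT_ADDRESS);
    }

    // Falls back to the default port when none has been set.
    public static int getInputRpcPort(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(INPUT_RPC_PORT, DEFAULT_RPC_PORT));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setInputInitialAddress(conf, "10.0.0.12");
        // No port was set, so the default applies.
        System.out.println(getInputInitialAddress(conf) + ":" + getInputRpcPort(conf));
        // prints "10.0.0.12:9160"

        // A misspelled method name, such as setInputInitalAddress(conf, ...),
        // fails to compile. A misspelled raw key, such as
        // conf.put("example.input.adress", ...), compiles and is silently ignored.
    }
}
```

The real ConfigHelper setters follow the same pattern, except that each one takes the job's Hadoop Configuration as its first argument instead of a Map.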
Since version 1.1, Cassandra supports wide-row column families, bulk loading, and secondary indexes.