Figure 5-5. Hadoop Web UI admin console
Alternatively, you can run the following command to validate the stored tweet file:
$HADOOP_HOME/bin/hadoop fs -lsr /apress/tweetdata
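On Hadoop 2.x and later, the -lsr shortcut is deprecated in favor of -ls -R. As a minimal sketch, the same check can be wrapped in a small shell function. The function name list_tweet_files, the HADOOP_BIN variable, and the fallback installation path are assumptions made for illustration; they are not part of Hadoop or of this exercise.

```shell
# Assumption: HADOOP_HOME points at your Hadoop installation. The fallback
# path below is a hypothetical default; override HADOOP_BIN if needed.
HADOOP_BIN="${HADOOP_BIN:-${HADOOP_HOME:-/usr/local/hadoop}/bin/hadoop}"

# list_tweet_files is a hypothetical helper name, not a Hadoop command.
# It recursively lists an HDFS directory (default: /apress/tweetdata)
# using "-ls -R", the replacement for the deprecated "-lsr" shortcut.
list_tweet_files() {
  "$HADOOP_BIN" fs -ls -R "${1:-/apress/tweetdata}"
}
```

Calling list_tweet_files with no argument lists /apress/tweetdata; passing another HDFS path lists that directory instead.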
For this sample exercise only, we store the live Twitter stream on the local file
system first. With real-time streaming of tweets, you may want to write the live
Twitter stream directly into HDFS using Flume or Scribe. Please refer to
https://cwiki.apache.org/FLUME/ or https://github.com/facebook/scribe.
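To give a rough idea of the Flume route, the following sketch writes a Flume NG (1.x) agent configuration that moves tweet files from a local directory into HDFS. Note that it uses a spooling-directory source rather than a live Twitter source, which matches the local-file approach of this exercise. The agent name tweetAgent, the component names, the spool directory /tmp/tweets-spool, the file name tweet-agent.conf, and the NameNode URI hdfs://localhost:9000 are all assumptions; adjust them to your environment.

```shell
# Assumptions (none of these names come from the exercise): the agent and
# component names, the local spool directory, and the NameNode URI.
CONF_FILE="${CONF_FILE:-./tweet-agent.conf}"

cat > "$CONF_FILE" <<'EOF'
# One source, one channel, one sink.
tweetAgent.sources = tweetDir
tweetAgent.channels = memChannel
tweetAgent.sinks = hdfsSink

# Spooling-directory source: picks up completed files dropped in spoolDir.
tweetAgent.sources.tweetDir.type = spooldir
tweetAgent.sources.tweetDir.spoolDir = /tmp/tweets-spool
tweetAgent.sources.tweetDir.channels = memChannel

# In-memory channel that buffers events between the source and the sink.
tweetAgent.channels.memChannel.type = memory
tweetAgent.channels.memChannel.capacity = 1000

# HDFS sink: writes events as plain text under /apress/tweetdata.
tweetAgent.sinks.hdfsSink.type = hdfs
tweetAgent.sinks.hdfsSink.channel = memChannel
tweetAgent.sinks.hdfsSink.hdfs.path = hdfs://localhost:9000/apress/tweetdata
tweetAgent.sinks.hdfsSink.hdfs.fileType = DataStream
EOF

echo "Wrote $CONF_FILE"
```

With Flume installed, such an agent would typically be started with flume-ng agent --conf ./conf --conf-file ./tweet-agent.conf --name tweetAgent. A memory channel keeps the sketch short but loses buffered events if the agent stops; a file channel is the usual choice when durability matters.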
For more information on installation and setup, you can refer to
http://hadoop.apache.org/docs/stable/single_node_setup.html.
In this example, we discussed how to store live tweets in HDFS. In the following
section, we will explore writing a MapReduce program to store the tweet count of
specific users and dates into the Cassandra column family.
 