// Run the MapReduce job
JobCreationResults mrJobResults = jobClient.CreateMapReduceJob(mrJobDefinition);
Console.Write("Executing WordCount MapReduce Job.");
// Wait for the job to complete
WaitForJobCompletion(mrJobResults, jobClient);
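If WaitForJobCompletion() has not been defined yet in your project, a minimal polling sketch follows. It assumes the SDK's GetJob() method and the JobStatusCode enumeration behave as in the Microsoft.Hadoop.Client samples; the five-second poll interval is an arbitrary choice:

```csharp
// Sketch of a WaitForJobCompletion() helper. Assumes the job client
// exposes GetJob(jobId) returning JobDetails with a StatusCode; the
// poll interval is illustrative, not prescribed by the SDK.
private static void WaitForJobCompletion(
    JobCreationResults jobResults, IJobSubmissionClient client)
{
    JobDetails jobInProgress = client.GetJob(jobResults.JobId);
    // Keep polling until the job is no longer running
    while (jobInProgress.StatusCode != JobStatusCode.Completed
        && jobInProgress.StatusCode != JobStatusCode.Failed)
    {
        Thread.Sleep(TimeSpan.FromSeconds(5));
        jobInProgress = client.GetJob(jobInProgress.JobId);
    }
}
```

Polling is the simplest approach here; for long-running jobs you may want to add a timeout so a stuck job does not block the console application indefinitely.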
The final step, after the job completes, is to read the output stream from blob storage and display it. The following piece of code does that:
Stream stream = new MemoryStream();
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
"DefaultEndpointsProtocol=https;AccountName="
+ Constants.storageAccount
+ ";AccountKey="
+ Constants.storageAccountKey);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer blobContainer =
blobClient.GetContainerReference(Constants.container);
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference(
    "example/data/WordCountOutput/part-r-00000");
blockBlob.DownloadToStream(stream);
stream.Position = 0;
StreamReader reader = new StreamReader(stream);
Console.Write("Done. Word counts are:\n");
Console.WriteLine(reader.ReadToEnd());
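The part-r-00000 file produced by the WordCount job contains one word and its count per line, separated by a tab. If you want to work with the counts programmatically rather than dump them as raw text, a small parser like the following sketch (class and method names are illustrative) turns the stream into a dictionary:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class WordCountParser
{
    // Parse MapReduce WordCount output: one "word<TAB>count" pair per line.
    public static Dictionary<string, long> Parse(TextReader reader)
    {
        var counts = new Dictionary<string, long>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            int tab = line.IndexOf('\t');
            if (tab < 0) continue;            // skip malformed lines
            string word = line.Substring(0, tab);
            long count = long.Parse(line.Substring(tab + 1));
            counts[word] = count;
        }
        return counts;
    }

    static void Main()
    {
        // Sample input in the same tab-separated format as part-r-00000
        var sample = new StringReader("the\t1036\nvinci\t17\n");
        var counts = Parse(sample);
        Console.WriteLine(counts["the"]);   // prints 1036
    }
}
```

The same Parse() call works against the blob output by wrapping the downloaded stream in a StreamReader, exactly as the reader is created in the code above.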
The entire DoMapReduce() method should look similar to Listing 5-9.
Listing 5-9. DoMapReduce() method
public static void DoMapReduce()
{
    // Define the MapReduce job
    MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters()
    {
        JarFile = "wasb:///example/jars/hadoop-examples.jar",
        ClassName = "wordcount"
    };
    mrJobDefinition.Arguments.Add("wasb:///example/data/gutenberg/davinci.txt");
    mrJobDefinition.Arguments.Add("wasb:///example/data/WordCountOutput");

    // Get the certificate
    var store = new X509Store();
    store.Open(OpenFlags.ReadOnly);
    var cert = store.Certificates.Cast<X509Certificate2>()
        .First(item => item.Thumbprint == Constants.thumbprint);
    var creds = new JobSubmissionCertificateCredential(
        Constants.subscriptionId, cert, Constants.clusterName);

    // Create a Hadoop client to connect to HDInsight
    var jobClient = JobSubmissionClientFactory.Connect(creds);