    // Run the MapReduce job
    JobCreationResults mrJobResults = jobClient.CreateMapReduceJob(mrJobDefinition);
    Console.Write("Executing WordCount MapReduce Job.");

    // Wait for the job to complete
    WaitForJobCompletion(mrJobResults, jobClient);

    // Locate the job output file in the cluster's blob storage container
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
        "DefaultEndpointsProtocol=https;AccountName=" + Constants.storageAccount +
        ";AccountKey=" + Constants.storageAccountKey);
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer blobContainer =
        blobClient.GetContainerReference(Constants.container);
    CloudBlockBlob blockBlob =
        blobContainer.GetBlockBlobReference("example/data/WordCountOutput/part-r-00000");

    // Download the output into memory and print it; the using blocks dispose the streams
    using (Stream stream = new MemoryStream())
    {
        blockBlob.DownloadToStream(stream);
        // Rewind so the reader starts at the first byte of the downloaded data
        stream.Position = 0;
        using (StreamReader reader = new StreamReader(stream))
        {
            Console.Write("Done..Word counts are:\n");
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
Add a call to this method in Program.cs and run the program. You should see the job complete successfully,
and the words with their counts should be displayed in the console. Thus, from .NET you have two different
ways to submit MapReduce jobs to your HDInsight clusters: you can write your own .NET MapReduce classes, or you
can run any of the existing MapReduce programs bundled in .jar files.
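One detail in the method above is easy to miss: after DownloadToStream fills the MemoryStream, the stream position sits at the end of the data, so it has to be reset to 0 before a StreamReader can read anything. The following is a minimal, self-contained sketch of that behavior that needs no cluster or storage account. The class name StreamRewindDemo, the method WriteThenRead, and the sample text are illustrative assumptions and are not part of the HDInsight or storage libraries.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class; it only illustrates the rewind step.
public static class StreamRewindDemo
{
    // Writes text into a MemoryStream and reads it back.
    // When rewind is false, reading starts at the end of the stream.
    public static string WriteThenRead(string text, bool rewind)
    {
        using (var stream = new MemoryStream())
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            stream.Write(bytes, 0, bytes.Length); // position is now at the end

            if (rewind)
            {
                stream.Position = 0; // move back to the first byte
            }

            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }

    public static void Main()
    {
        // Made-up sample line in a word<TAB>count layout
        string sample = "hadoop\t3";
        Console.WriteLine("Without rewind: [" + WriteThenRead(sample, false) + "]");
        Console.WriteLine("With rewind:    [" + WriteThenRead(sample, true) + "]");
    }
}
```

Without the rewind, ReadToEnd returns an empty string, which is why omitting stream.Position = 0 in the method above would print no word counts even though the download succeeded.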
Submitting a Hive Job
As stated earlier, Hive is an abstraction over MapReduce. It provides a SQL-like language whose queries are
internally broken down into MapReduce jobs. This relieves the programmer of writing the MapReduce code and
developing the supporting infrastructure described in the previous section.
Adding the References
Launch the NuGet Package Manager Console, and install the Hive NuGet package by running the following
command:
install-package Microsoft.Hadoop.Hive
This should import the required .dll, along with any dependencies it may have. You will see output similar to the
following:
PM> install-package Microsoft.Hadoop.Hive
Attempting to resolve dependency 'Newtonsoft.Json (≥ 4.5.11)'.
Installing 'Microsoft.Hadoop.Hive 0.9.4951.25594'.
Successfully installed 'Microsoft.Hadoop.Hive 0.9.4951.25594'.
Adding 'Microsoft.Hadoop.Hive 0.9.4951.25594' to HadoopClient.
 