Chapter 4
Automating HDInsight Cluster Provisioning
Businesses almost always need to automate activities that are repetitive and can be predicted well in advance.
Through the strategic use of technology, an organization can increase its productivity and efficiency by automating
the recurring tasks in its daily workflow. Apache Hadoop exposes Java interfaces that let developers programmatically
create and manipulate Hadoop clusters.
The Microsoft .NET Framework is part of the automation picture in HDInsight. Existing .NET developers can
apply their skills to automate workflows in the Hadoop world, and programmers have the option to write
their MapReduce jobs in C# and VB .NET. HDInsight also supports Windows PowerShell for automating
cluster operations through scripts. PowerShell is a scripting environment and a particular favorite of Windows
administrators for scripting their tasks. There is also a command-line interface built on Node.js for automating
cluster-management operations. This chapter discusses the various ways to use the Hadoop .NET Software
Development Kit (SDK), Windows PowerShell, and the cross-platform Command-Line Interface (CLI) tools to
automate HDInsight service cluster operations.
Using the Hadoop .NET SDK
The Hadoop .NET SDK provides .NET client API libraries that make it easier to work with Hadoop from .NET. Because
the SDK is open source, it is hosted on the open source site CodePlex and can be downloaded from the
following link:
http://hadoopsdk.codeplex.com/
The SDK's components are distributed as NuGet packages, so you can easily add only the functionality you need.
NuGet is a Visual Studio extension that makes it easy to add, remove, and update libraries and tools in Visual Studio
projects that use the .NET Framework. When you add a library, NuGet copies files to your solution and automatically
adds and updates the required references in your app.config or web.config file. When you remove the library from
your project, NuGet reverts those changes so that nothing is left behind. For more detailed
information, visit the NuGet documentation site:
http://nuget.codeplex.com/documentation
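As a rough illustration of the add-and-remove workflow described above, the following commands can be typed into the Package Manager Console inside Visual Studio. This is a minimal sketch: the package ID Microsoft.Hadoop.MapReduce is used only as an example and is an assumption about which SDK component you want, so check the SDK site for the package that matches your needs.

```powershell
# Run these in the Visual Studio Package Manager Console, not a regular shell.
# The package ID below is an example; substitute the SDK component you need.
Install-Package Microsoft.Hadoop.MapReduce

# Removing the package later undoes the references and config changes NuGet made.
Uninstall-Package Microsoft.Hadoop.MapReduce
```

Both commands act on the project selected as the default project in the console, so confirm that selection before running them in a multi-project solution.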
 