HDInsight is secure by default. It disables Remote Desktop access to the
cluster by default and also provides (for free) a secure gateway virtual
machine that performs the authentication and authorization and exposes
endpoints on port 443. You can also retrieve Ambari metrics using the REST
API via the secure gateway on port 443.
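As a minimal sketch of what a call through the secure gateway might look like, the following Python builds (but does not send) an HTTPS request on port 443. The host pattern, the use of HTTP Basic credentials, and the example resource path are all assumptions made for illustration; check your own cluster's gateway address and the Ambari REST documentation for the real values.

```python
import base64
import urllib.request


def build_gateway_request(cluster_name, user, password, resource_path):
    """Build (but do not send) an HTTPS request to the cluster's secure gateway."""
    # Assumed host pattern; substitute your cluster's actual gateway address.
    host = f"{cluster_name}.azurehdinsight.net"
    # Port 443 is the HTTPS default; it is spelled out here to mirror the text.
    url = f"https://{host}:443/{resource_path.lstrip('/')}"
    # Assumption: the gateway accepts HTTP Basic credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    # Hypothetical cluster name, credentials, and Ambari resource path.
    req = build_gateway_request("mycluster", "admin", "secret", "ambari/api/v1/clusters")
    print(req.full_url)
    # Sending it would look like: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before any credentials leave your machine.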
HDInsight uses HDP as its base distribution, but limitations apply. For
example, at the moment, HBase is not included in HDInsight. On the plus
side, building on HDP means complete ubiquity. You can take your data set,
export it from HDInsight if you want, and install it in an on-premises or
IAAS installation running HDP, and it will just work.
In late 2013, Microsoft released HDInsight to general availability. It
currently is not installed in all Azure data centers, but I expect that to
change as the space is provisioned in each data center.
Infrastructure as a Service (IAAS)
Much of what can be achieved currently with HDInsight and PAAS can also
be achieved using a combination of IAAS and scripting. You can automate
the process of creating your environment and also elastically adjust the
resources each time you spin up a Hadoop cluster. Granted, you will have to
roll your own scripts to build this environment, but you will have ultimate
flexibility in your deployment during the installation process. You may find
that it takes you more time to configure and build your IAAS environment
than it would to use the PAAS option with HDInsight.
Note, however, that some additional tasks do become your responsibility.
For example, the secure node in HDInsight isn't automatically provisioned,
and neither is the metastore. Also, if you want to save money and keep hold
of your data, you will need to have your own automated process for detaching
disks, deleting virtual machines, and performing any additional cleanup. The
same is clearly true, in the inverse, for creating the cluster.
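To make the ordering concrete, here is a small sketch of how such an automated process might plan its steps. It only produces an ordered list of actions; the action names (detach_disks, delete_vm, and so on) and the virtual machine names are hypothetical placeholders, not real Azure commands, and you would map each one onto your own scripts.

```python
# Hypothetical action names; each would map to one of your own scripts.
INVERSE = {
    "detach_disks": "attach_disks",
    "delete_vm": "create_vm",
    "cleanup": "prepare",
}


def teardown_plan(vm_names, keep_data=True):
    """Return ordered (action, target) steps for tearing down a cluster."""
    steps = []
    for vm in vm_names:
        if keep_data:
            # Detach the data disks first so they survive the VM deletion.
            steps.append(("detach_disks", vm))
        steps.append(("delete_vm", vm))
    # Any additional cleanup runs once, after every VM is gone.
    steps.append(("cleanup", "cluster"))
    return steps


def creation_plan(vm_names):
    """Return the teardown steps reversed, with each action inverted."""
    return [(INVERSE[action], target)
            for action, target in reversed(teardown_plan(vm_names))]


if __name__ == "__main__":
    for step in teardown_plan(["head", "worker1"]):
        print(step)
    for step in creation_plan(["head", "worker1"]):
        print(step)
```

Deriving the creation plan from the teardown plan keeps the two in step: if you add a teardown action, the lookup in INVERSE fails loudly until you define its counterpart.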
Interestingly enough, the IAAS option opens up the opportunity to run
Linux or Windows virtual machines inside of Windows Azure. Therefore,
you can actually run any flavor of Hadoop you like on Windows Azure
including the latest and greatest versions of Hortonworks or even Cloudera's
Impala. You simply aren't constrained to Windows as an operating system
on the Microsoft data platform. If you said that to me a few years ago, I'd
have looked at you somewhat quizzically. However, the world has simply
moved on. Cloud platforms are more interested in managing the compute