Database Reference
In-Depth Information
“Virtualizing Apache Hadoop”
Docker
License
Apache License, Version 2.0
Activity
High
Purpose
Container to run apps, including Hadoop nodes
Official Page
https://www.docker.com
Hadoop Integration No Integration
You may have heard the buzz about Docker and containerized applications. A little history
may help here. Virtual machines were a large step forward in both cloud computing and in-
frastructure as a service (IaaS). Once a Linux virtual machine was created, it took less than a
minute to spin up a new one, whereas building a Linux hardware machine could take hours.
But there are some drawbacks. If you built a cluster of 100 VMs, and needed to change an
OS parameter or update something in your Hadoop environment, you would need to do it on
each of the 100 VMs.
To understand Docker's advantages, you'll find it useful to understand its lineage. First came
chroot jails, in which Unix subsystems could be built that were restricted to a smaller
namespace than the entire Unix OS. Then came Solaris containers in which users and pro-
grams could be restricted to zones, each protected from the others with many virtualized re-
sources. Linux containers are roughly the same as Solaris containers, but for the Linux OS
rather than Solaris. Docker arose as a technology to do lightweight virtualization for applica-
tions. The Docker container sits on top of Linux OS resources and just houses the application
code and whatever it depends upon over and above OS resources. Thus Dockers enjoys the
resource isolation and resource allocation features of a virtual machine, but is much more
portable and lightweight. A full description of Docker is beyoond the scope of this topic, but
recently attempts have been made to run Hadoop nodes in a Docker environment.
Search WWH ::




Custom Search