At its core, Hadoop has two primary functions:
• Processing data (MapReduce)
• Storing data (HDFS)
With the advent of Hadoop 2.0, the next major release of Hadoop, we will
see the decoupling of resource management from data processing. This adds
a third primary function to this list. However, at the time of this writing,
YARN, the Apache project responsible for resource management, is in
alpha technology preview mode.
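To make the two primary functions concrete, the sketch below is the classic word-count job written against the Hadoop MapReduce Java API: the input and output files live on HDFS (storage), and the map and reduce phases do the processing. The class name, job name, and paths are illustrative rather than drawn from the text, and API details may vary between Hadoop releases.

// A minimal sketch of Hadoop's two primary functions working together:
// HDFS stores the input and output files, and MapReduce processes them.
// Class name, job name, and paths are placeholders for illustration only.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every word in each line read from HDFS.
  public static class TokenizerMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts for each word and write totals back to HDFS.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input and output paths resolve to directories in HDFS.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A job like this would typically be packaged as a JAR and submitted with a command such as hadoop jar wordcount.jar WordCount /input /output, where both paths are placeholder HDFS directories.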
That said, a number of additional subprojects built on top of these two
primary functions have been developed and added to the ecosystem. When
bundled together, these subprojects plus the core projects of MapReduce
and HDFS become known as a distribution.
Derivative Works and Distributions
To fully understand a distribution, you must first understand the role,
naming, and branding of Apache Hadoop. The basic rule here is that only
official releases by the Apache Hadoop project may be called Apache
Hadoop or Hadoop. So, what about companies that build products/
solutions on top of Hadoop? This is where the term derivative works comes
in.
What Are Derivative Works?
Any product that uses Apache Hadoop code, known as artifacts, as part
of its construction is said to be a derivative work. A derivative work is
not an Apache Hadoop release. It may be true that a derivative work can
be described as “powered by Apache Hadoop.” However, there is strict
guidance on product naming to avoid confusion in the marketplace.
Consequently, the distributions of Hadoop that companies provide should
also be considered derivative works.