proprietary components? No one knows. Importantly, though, the
precedent has been set. As product companies look to monetize their
investment, it seems inevitable that there will ultimately be more
proprietary products built on top of Hadoop.
Microsoft's foray into the world of big data and open source software (OSS)
has also overlapped with the even broader, even more strategic shift in focus
to the cloud with Windows Azure. This has led to some very interesting
consequences for the big data strategy that would otherwise never have
materialized. Have you ever considered Linux to be part of the Microsoft
data platform? Neither had I!
With these thoughts in your mind, I now urge you to read on and learn more
about this fascinating ecosystem. Understand Microsoft's relationship with
the open source world and gain insight into your deployment choices for your
Apache Hadoop cluster.
NOTE
If you want to know more about project Dryad, this site provides a great
starting point: http://research.microsoft.com/en-us/projects/dryad/.
You will notice some uncanny similarities.
Competition in the Ecosystem
Just because Hadoop is an open source series of projects doesn't mean for
one moment that it is uncompetitive. Quite the opposite. In many ways, it is
a bit like playing cards but with everyone holding an open hand; everyone
can see each other's cards. That is, until they can't. Many systems use open
source technology as part of a mix of components that blend in proprietary
extensions. These proprietary elements are what closes the hand and fuels
the competition. We will see an example of this later in this chapter when we
look at Cloudera's Impala technology.
Hadoop is no exception. To differentiate themselves in the market,
distributors of Hadoop have opted to move in different directions rather
than collaborate on a single project or initiative. To highlight how this is all
playing out, let's focus on one area: SQL on Hadoop. No area is more hotly