Database Reference
In-Depth Information
Infrastructure is often a major investment. Perhaps the most important advantage of
using virtualized public-cloud providers for distributed-data applications is the
ability to use a bare minimum of processing without investing in fixed hardware costs.
Even if the data processing and latency can't be absolutely controlled, it may be more
important to keep costs and maintenance time down. Let someone else do this work
for you while you concentrate on solving your data challenge.
There are many cases in which building and maintaining your own hardware is
advantageous. The time it takes to export a great deal of in-house data into a cloud
system (or even from one cloud provider to another) can often be prohibitive. Control
over higher-performance applications might be a useful consideration. If the cost of
maintaining both hardware and staff is acceptable for your organization, it can also
be possible to achieve much better performance-per-price characteristics with in-house
hardware. With the right administrative expertise, the total cost of ownership might
be lower as well. In most other cases, it makes more sense to do whatever it takes to
avoid dealing with the management of hardware.
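
The trade-off described above can be made concrete with some back-of-the-envelope
arithmetic. The following is a minimal sketch, not a definitive cost model: the
function names, the 80 percent link-efficiency default, and every figure in the
usage lines are hypothetical assumptions chosen for illustration, and real
estimates would also need to account for staff time, power, and depreciation.

```python
import math


def transfer_days(data_terabytes, uplink_megabits_per_sec, efficiency=0.8):
    """Estimate the days needed to upload a dataset over a network link.

    `efficiency` is an assumed fraction of the nominal bandwidth that is
    actually sustained; 0.8 is an illustrative guess, not a measured value.
    """
    total_bits = data_terabytes * 1e12 * 8
    usable_bits_per_sec = uplink_megabits_per_sec * 1e6 * efficiency
    return total_bits / usable_bits_per_sec / 86_400  # seconds per day


def breakeven_months(hardware_upfront, inhouse_monthly, cloud_monthly):
    """Months until cumulative in-house cost no longer exceeds cloud cost.

    Returns None when in-house running costs are not lower than the cloud
    bill, in which case the up-front hardware spend is never recovered.
    """
    monthly_saving = cloud_monthly - inhouse_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_upfront / monthly_saving)


if __name__ == "__main__":
    # Hypothetical scenario: 100 TB of in-house data, 1 Gbps uplink.
    print(f"Upload time: {transfer_days(100, 1000):.1f} days")
    # Hypothetical scenario: $120,000 of hardware, $4,000/month to run it
    # in-house, versus $9,000/month for equivalent cloud capacity.
    print(f"Break-even after {breakeven_months(120_000, 4_000, 9_000)} months")
```

Under these made-up numbers the upload alone takes more than eleven days, and the
hardware pays for itself only after two years, which illustrates why a short-lived
project or proof of concept usually favors the cloud.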
For many data processing applications, it is advisable to avoid buying or leasing phys-
ical infrastructure whenever possible. The fixed costs of investing in physical hardware
are so great that this solution should only be used when necessary. For distributed-data
applications, always first consider using virtualized systems on a public cloud. Even
when starting a large-scale data project, building a proof of concept using cloud infra-
structure is a good way to test the application without over-investing in hardware.
Understand the Costs of Open-Source
As is commonly said about the open-source software communities, the English lan-
guage doesn't have separate words for “free” as in “freedom” and “free” as in “free of
charge.” Open-source software is always free the way speech is free—but this doesn't
mean that it will always cost nothing to implement. Although it can often be free, as
in “free beer,” there are always costs associated with doing things yourself.
A common criticism (or rather, fear) about open-source software projects focuses
on the myth that there are no support options. This is sometimes true, especially
around bleeding-edge technologies. However, for more mature projects, the developer
and user communities around popular open-source data technologies are vibrant. The
Hadoop tag on the popular tech question-and-answer site Stack Overflow has well over
6,000 questions on its own, which doesn't even include the hundreds of tagged posts
for related technologies such as “map-reduce,” “hive,” and “hdfs.” Similarly, com-
panies such as Red Hat have shown that it is possible to build viable business models
around support and training involving open-source solutions. The popular
open-source document database MongoDB is supported by the company 10gen (since
renamed MongoDB, Inc.), which provides enterprise support for a fee.
Using open-source software can also be technically rewarding. The experience of
working through installation, use, and even modification of open-source code needed
to solve data problems can enhance overall engineering skills.
 
 