manages Hadoop jobs used for the bulk ingress or egress. It is controlled by
a separate client process.
Lambda Architectures
One of the key features all of the storage systems discussed in this chapter
possess is the ability to tune their update mechanism. Practically, this means
abandoning transactions in favor of write speed. Combined with the
at-least-once nature of the data-flow mechanisms, this can lead to both
under-counting and over-counting of aggregates and other related
problems.
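The over-counting case can be illustrated with a minimal Python sketch. The event IDs, keys, and the `count_events` helper are hypothetical and are not taken from any system in this chapter. The sketch assumes that one event is redelivered because its acknowledgment was lost, and that the store applies a plain increment for every delivery.

```python
from collections import Counter


def count_events(deliveries):
    """Non-transactional update: every delivery increments the counter."""
    counts = Counter()
    for _event_id, key in deliveries:
        counts[key] += 1
    return counts


# Hypothetical stream of (event_id, key) pairs. Event 2 arrives twice,
# as can happen under at-least-once delivery when an acknowledgment is lost.
deliveries = [(1, "page_a"), (2, "page_a"), (2, "page_a"), (3, "page_b")]

# What the real-time store ends up holding.
real_time = count_events(deliveries)

# What the count would be if each event were applied exactly once.
true_counts = count_events(set(deliveries))

print(real_time["page_a"], true_counts["page_a"])
```

Here the real-time count for `page_a` is one higher than the true count. Under-counting can arise in the opposite way, for example when concurrent unsynchronized writes overwrite each other's increments.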
To overcome this problem, Nathan Marz introduced the concept of the
Lambda Architecture. In this architecture, the real-time system updates its
data stores as before, without regard to transactions. It is accepted that these
values are only an approximation of the true values.
The final values are computed using the data warehousing system. In most
examples, these final values are computed by large Hadoop batch jobs. One
approach keeps the two sets of values in separate systems and lets the
front-end interface manage the selection between them.
The other approach is to use the same database to store both and overwrite
the real-time values with the values from the batch system as they become
available. This requires fewer resources and is easier to manage on the front
end, but it can be harder to implement. It is also often desirable to use
different storage technologies for the short-term storage of the real-time
system and the long-term storage of the “final” results coming from the
batch system.
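The two serving approaches described above can be sketched roughly as follows. This is an illustration only: plain Python dictionaries stand in for the real-time and batch stores, and the function names, keys, and the "final"/"approximate" labels are assumptions made for the example rather than part of any particular system.

```python
def select_value(key, batch_store, realtime_store):
    """Front-end selection: prefer the batch value, else fall back to real time."""
    if key in batch_store:
        return batch_store[key], "final"
    return realtime_store.get(key, 0), "approximate"


def overwrite_with_batch(store, batch_results):
    """Single-store approach: batch results replace the real-time values."""
    for key, value in batch_results.items():
        store[key] = (value, "final")


# Hypothetical hourly counts keyed by (page, hour).
realtime_store = {("page_a", "10:00"): 3, ("page_a", "11:00"): 5}
# The batch job has only finished the 10:00 bucket so far.
batch_store = {("page_a", "10:00"): 2}

# Approach 1: two stores, with the front end choosing between them.
first = select_value(("page_a", "10:00"), batch_store, realtime_store)
second = select_value(("page_a", "11:00"), batch_store, realtime_store)

# Approach 2: one store, with batch values overwriting the approximations.
single_store = {k: (v, "approximate") for k, v in realtime_store.items()}
overwrite_with_batch(single_store, batch_store)

print(first, second, single_store)
```

In both cases a reader sees the corrected value for the 10:00 bucket and the approximate value for the 11:00 bucket until the batch system catches up. The difference lies in where the selection logic lives: in the front end or in the write path.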
Conclusion
This chapter has covered the basics of some of the more popular data storage
options available. Essentially all storage options work in a given situation
given enough effort, but some make more sense for certain applications than
others. This is often not an easy decision, so it is often best to try a few things
out and experiment. The first attempt at scaling to available data will usually
eliminate at least a few options.
Now that data is streaming into a processing system and that processing
system has someplace to put its output, it is time to build some applications.
The next chapter puts a face on the data by building a simple dashboard