can predict and design for all possible user data access scenarios. Therefore, design for as
much flexibility as possible, but be aware that you cannot anticipate every scenario. That is
why polyglot persistence is going to become increasingly important.
Rule 8: Loading: There are two circumstances in which the loading capability of the
database will be relevant: either you have large amounts of data to load, or you need
to load data in real time or near real time. In some circumstances you may have to
design for both scenarios, so evaluate whether a product supports a high ingestion rate,
(near) real-time loading, or both. Raw loading capacity is simply a question of the size
of the pipe into the database, bearing in mind any parallelism the product provides.
Real-time loading requires either the ability to micro-batch data (say, in batches of one
minute) or an explicit trickle-feeding mechanism such as change data capture or a
streaming capability.
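As an illustration of the micro-batching idea, the following is a minimal Python sketch, not any particular product's loader. The name micro_batches, the (timestamp, payload) record shape, and the 60-second default window are assumptions made for the example; the events are assumed to arrive in timestamp order.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record shape: (timestamp in seconds, payload).
Event = Tuple[float, dict]


def micro_batches(events: Iterable[Event],
                  window_seconds: float = 60.0) -> Iterator[List[dict]]:
    """Group time-ordered events into fixed windows, yielding one batch per window.

    Windows start at the timestamp of the first event. Empty windows are skipped.
    """
    batch: List[dict] = []
    window_end = None
    for ts, payload in events:
        if window_end is None:
            window_end = ts + window_seconds
        # Close windows until the current event fits into the open one.
        while ts >= window_end:
            if batch:
                yield batch
                batch = []
            window_end += window_seconds
        batch.append(payload)
    # Flush whatever is left in the final, partially filled window.
    if batch:
        yield batch


def demo() -> List[List[dict]]:
    """Build a few sample events and return the batches they fall into."""
    events = [
        (0.0, {"id": 1}),
        (10.0, {"id": 2}),
        (59.0, {"id": 3}),
        (60.0, {"id": 4}),
        (125.0, {"id": 5}),
    ]
    return list(micro_batches(events, window_seconds=60.0))
```

Each yielded batch would then be handed to the database's bulk-load path in a single call, which is what distinguishes micro-batching from row-at-a-time trickle feeding.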
Rule 9: Complex Analytics: By “complex analytics” we do not necessarily mean that the
questions customers want to ask are complex: even simple-looking queries (e.g., full table
scans or large table joins) can bring the database to a grinding halt. While there is no
formal definition of what constitutes a complex query, such queries typically involve
multi-way and multi-table joins, whole table scans, correlated sub-queries, and other
functions that are compute intensive, I/O intensive, or both. Your solution has to
perform such queries in a timely manner, so you will require a database product that
can cope with this workload, bearing in mind that these queries may be ad hoc and
must still perform to expectations.
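To make the terminology concrete, here is a small sketch of a query that combines a table join with a correlated sub-query, run against an in-memory SQLite database from Python's standard library. The customers and orders tables, their columns, and the sample rows are hypothetical and exist only for this example; on a toy data set the query is cheap, but the same shape over large tables is the kind of workload described above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES
            (1, 1, 10.0), (2, 1, 30.0),
            (3, 2, 5.0), (4, 2, 5.0), (5, 2, 20.0);
        """
    )
    return conn


# The inner SELECT refers to o.customer_id from the outer query, so it is
# logically re-evaluated for every outer row: a correlated sub-query.
ABOVE_AVERAGE_ORDERS = """
    SELECT c.name, o.id, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount > (
        SELECT AVG(o2.amount)
        FROM orders AS o2
        WHERE o2.customer_id = o.customer_id
    )
    ORDER BY o.id
"""


def above_average_orders(conn: sqlite3.Connection) -> list:
    """Return each order that exceeds its own customer's average order amount."""
    return conn.execute(ABOVE_AVERAGE_ORDERS).fetchall()
```

Calling above_average_orders(build_demo_db()) returns the orders that exceed their customer's average. How a given product plans and executes such a query at scale is exactly what an evaluation against this rule should measure.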
Rule 10: Scalability: If there is one thing you need to worry about, it is increasing
volumes of data. Whatever solution you choose needs to scale easily as data volumes
grow. It is not just a question of storing larger amounts of data; it is also about how
quickly you can ingest data from multiple sources. Moreover, as the value of your
analytic application or platform becomes apparent, more users are likely to run more
queries, so the database will also need to scale in terms of user concurrency.
References
THE DATABASE REVOLUTION: A Perspective On Database: Where We Came From and
Where We're Going: The Bloor Group
10 Rules: Embedding a Database for High Performance Reporting and Analytics: Bloor
Research
martinfowler.com/articles/nosql-intro.pdf
NoSQL Distilled - A Brief Guide to the Emerging World of Polyglot Persistence: Pramod J Sadalage,
Martin Fowler