However, these performance improvement considerations cannot be applied to every
possible scenario. Different workloads demand different kinds of tuning, and at design
time you cannot possibly anticipate every data access pattern.
For all of these reasons, choosing a relational database alone will not be sufficient.
You will have to look at a combination of databases to solve the business problem.
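As a hedged illustration of what such a combination (often called polyglot persistence) can look like, the sketch below writes an order to a relational store and a clickstream event to a document store; the connection details, table, and collection names are assumptions made purely for illustration.

    # Sketch of polyglot persistence: a relational database for transactional
    # records, a document store for semi-structured events. Hosts, credentials,
    # table and collection names below are hypothetical.
    import psycopg2                  # relational store (PostgreSQL)
    from pymongo import MongoClient  # document store (MongoDB)

    # Transactional write: the order row goes to the relational database.
    pg = psycopg2.connect(host="localhost", dbname="sales", user="app", password="secret")
    with pg, pg.cursor() as cur:
        cur.execute(
            "INSERT INTO orders (customer_id, total) VALUES (%s, %s)",
            (42, 199.99),
        )

    # Event write: the flexible clickstream document goes to the document store.
    mongo = MongoClient("mongodb://localhost:27017")
    mongo.analytics.clickstream.insert_one(
        {"customer_id": 42, "page": "/checkout", "referrer": "email-campaign-7"}
    )

Each store then serves the workload it suits best: the relational side handles the transactional queries, while the document side absorbs high-volume, loosely structured analytic data.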
Rule 2: Ease of Implementation: There are two aspects to this: first, the database
should be easy to install; second, the resulting application, with its underlying database,
should be easy to deploy. In particular, the end user should not have to go through the
painful process of understanding the technology architecture components in order to
configure any of the database elements during implementation.
Rule 3: High Performance: Everybody expects top-notch performance from their
applications, and it matters even more for analytic implementations than for
transactional ones. For analytics workloads, it is not easy to predict which particular
analyses or complex queries users are going to run at any given time. In a transactional
environment, on the other hand, you know (roughly) the types of queries that will be run
and the throughput that has to be supported. When it comes to analytics, however, there
are not only ever-increasing amounts of data available to analyze but also new types of
data that might be appropriate to include in queries. Thus, resolving scalability issues is
of paramount importance.
Rule 4: High Availability: High availability is always a potential requirement
whenever an application is deemed to be “mission critical.” The question, of course,
is whether analytic applications are regarded as mission critical, and the answer is
that it depends on the application and the user. For example, if you have a real-time
requirement for security event monitoring, or fraud detection, then high availability is
likely to be essential. On the other hand, if you are using analytics to support some sort of
customer intelligence application used on a periodic basis, then high availability may not
be a critical requirement. The conclusion, therefore, must be that in some environments
high availability is a must-have, while in others it is a nice-to-have. But high availability
comes at the cost of other considerations, such as data consistency and partition tolerance.
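To make that trade-off concrete, the sketch below, assuming a distributed store such as Apache Cassandra and a hypothetical security_events table, shows how each request can choose its own consistency level: ONE leans toward availability, while QUORUM demands agreement from a majority of replicas.

    # Sketch of per-request consistency tuning with the Cassandra Python driver;
    # the keyspace and table are hypothetical.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("monitoring")

    # Availability-leaning write: a single replica acknowledgement is enough, so
    # the write can still succeed while some replicas are unreachable.
    fast_write = SimpleStatement(
        "INSERT INTO security_events (id, severity) VALUES (uuid(), 'high')",
        consistency_level=ConsistencyLevel.ONE,
    )
    session.execute(fast_write)

    # Consistency-leaning read: a majority of replicas must respond, trading some
    # availability during a partition for a stronger consistency guarantee.
    strict_read = SimpleStatement(
        "SELECT * FROM security_events LIMIT 10",
        consistency_level=ConsistencyLevel.QUORUM,
    )
    rows = session.execute(strict_read)

Which side of the trade-off to favor depends, as noted above, on whether the application is regarded as mission critical.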
Rule 5: Low Cost: The requirement for low cost is increasingly becoming a hot topic
on the CIO's agenda. This is not just about the license cost paid to the software provider
but also about the hardware requirements. If you are dealing with big data scales and you
need scalability and high availability options, it is clearly advantageous if the database is
highly distributed and runs on low-cost commodity hardware.
Rule 6: Ease of Migration: This won't apply in every case, because sometimes new
analytic applications are being built rather than existing solutions being ported to a new
platform. However, where it does apply, the ease and speed with which the migration
can be implemented will be a major factor. A number of vendors provide specific
capabilities to port from one of these environments to another, ensuring that existing
applications run without change and that database schemas can be imported directly
into the new environment.
Rule 7: Flexibility: Do you want to offer an environment in which users can only
query what you have pre-prepared for them, or do you want to allow them to make ad hoc
or train-of-thought inquiries that go beyond any pre-defined path? While it is always good
to provide as comprehensive a set of out-of-the-box analytic functions as possible, nobody
can anticipate every analysis that users will eventually want to run.
 