In the shared-nothing model, each server contains a portion of the database, and
no server contains the entire database. The model is designed to process as much data as possible at
each node and share data between nodes only when necessary. Although the database
runs independently on multiple nodes, it appears as a single entity to any application.
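As a rough illustration of the idea just described, the following minimal Python sketch partitions rows across nodes by hashing a key, lets each node aggregate only its own rows, and has a coordinator merge the small partial results. The `Node` and `Cluster` classes, the `order_id`, `region`, and `amount` fields, and the use of CRC32 as the partitioning hash are all assumptions made for this example; they are not taken from any particular product.

```python
from collections import defaultdict
from zlib import crc32


class Node:
    """One server in the cluster; it holds only its own partition of the rows."""

    def __init__(self, name):
        self.name = name
        self.rows = []  # local storage, never read directly by other nodes

    def insert(self, row):
        self.rows.append(row)

    def local_totals(self):
        # Process as much as possible locally: aggregate this node's rows only.
        totals = defaultdict(int)
        for row in self.rows:
            totals[row["region"]] += row["amount"]
        return dict(totals)


class Cluster:
    """Hypothetical coordinator that makes the nodes look like one database."""

    def __init__(self, node_count):
        self.nodes = [Node(f"node-{i}") for i in range(node_count)]

    def _owner(self, key):
        # Deterministic hash, so a given key always maps to the same node.
        index = crc32(str(key).encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def insert(self, row):
        self._owner(row["order_id"]).insert(row)

    def totals_by_region(self):
        # Only the small per-node partial results are shared between nodes.
        merged = defaultdict(int)
        for node in self.nodes:
            for region, amount in node.local_totals().items():
                merged[region] += amount
        return dict(merged)


# Made-up sample data, for illustration only.
orders = [
    {"order_id": 1, "region": "east", "amount": 10},
    {"order_id": 2, "region": "west", "amount": 20},
    {"order_id": 3, "region": "east", "amount": 5},
    {"order_id": 4, "region": "west", "amount": 7},
    {"order_id": 5, "region": "east", "amount": 1},
]

cluster = Cluster(node_count=3)
for order in orders:
    cluster.insert(order)

result = cluster.totals_by_region()
```

The caller talks only to `cluster`, never to an individual node, which mirrors the single-entity view an application sees. With the sample rows above, `result` holds a total of 16 for `east` and 27 for `west`, regardless of how the hash happens to spread the rows across nodes.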
This model resolves the core limitation of I/O bottlenecks facing single and clustered
servers. Adding a node to a shared-nothing database increases the processors and memory
available and, more importantly, the disk bandwidth as well. A group of small servers can
easily outstrip the total I/O throughput of a very large server or shared-disk cluster.
Scaling in this way also lowers the overall hardware cost because commodity servers
can be used. A collection of small servers with the same total processors,
memory, and storage is less expensive than a single large server. See Table 4-4.

We've spent a good deal of time discussing database evolution and the various
database technologies suitable for different types of workloads. Below are a number
of conclusions regarding database architectures:
Table 4-4. Scale up and scale out considerations

Scaling up a Database Platform

Scale Up                                 Scale Out
---------------------------------------  ---------------------------------------
Vertical expansion/upgrade to more       Horizontal expansion through a grid or
powerful server configuration            cluster of commodity servers

More expensive hardware                  Less expensive hardware

Eventually hits a limit                  Less likely to hit a limit

Relational databases (RDBMSs) still fit the needs of most database
implementations, but they have reached scalability limits that make
them either impractical or too expensive for specialized workloads.
New entrants to the market and alternative approaches are often
better suited to specific workloads.

The relational database is still the preferred choice for most
applications today. Database preferences are changing, however,
particularly for new applications that have high scalability
requirements for data size or user concurrency. If you find
yourself working with a system that has specific needs, let the
workload be your primary guide.

When analyzing workloads, be sure to consider all the components.
For example, if you run a consumer-facing website on the database
but also want to analyze data using machine-learning algorithms,
you are dealing with two distinct workloads. One requires real-time
read-write activity, and the other requires read-intensive,
computationally heavy activity. These are generally incompatible
within the same database without careful design considerations.