Databases Reference
In-Depth Information
[Figure 8.2 diagram: incoming requests reach a primary load balancer, backed by a failover load balancer, which listens to all service requests on a port and distributes each request to the least-busy node. Heartbeat signals indicate which application servers are healthy. The application servers get their data from many NoSQL databases and return outgoing responses.]
Figure 8.2 A load balancer is ideal when you have a large number of processors that can
each fulfill a service request. To gain performance advantages, all service requests arrive
at a load balancer service that distributes the request to the least-busy processor. A
heartbeat signal from each application server provides a list of which application servers
are working. An application server may request data from one or more NoSQL databases.
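The routing behavior in Figure 8.2 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class names, the `heartbeat_timeout` value, and the use of an active-request count as the measure of "least busy" are all hypothetical and do not come from any particular load balancer product.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AppServer:
    name: str
    active_requests: int = 0     # current load, used to pick the least-busy server
    last_heartbeat: float = 0.0  # time at which the last heartbeat was received


@dataclass
class LoadBalancer:
    heartbeat_timeout: float = 5.0  # hypothetical: seconds of silence before a server counts as down
    servers: dict = field(default_factory=dict)

    def heartbeat(self, name, now=None):
        """Record a heartbeat; unknown servers are registered on first contact."""
        now = time.monotonic() if now is None else now
        server = self.servers.setdefault(name, AppServer(name))
        server.last_heartbeat = now

    def healthy(self, now=None):
        """Return the servers whose last heartbeat falls within the timeout."""
        now = time.monotonic() if now is None else now
        return [s for s in self.servers.values()
                if now - s.last_heartbeat <= self.heartbeat_timeout]

    def route(self, now=None):
        """Pick the healthy server with the fewest active requests."""
        candidates = self.healthy(now)
        if not candidates:
            raise RuntimeError("no healthy application servers")
        target = min(candidates, key=lambda s: s.active_requests)
        target.active_requests += 1
        return target.name

    def finish(self, name):
        """Mark one request on the named server as completed."""
        server = self.servers[name]
        server.active_requests = max(0, server.active_requests - 1)
```

In use, each application server would call `heartbeat` periodically, the balancer would call `route` for every incoming request, and `finish` once the response has been sent. A server that stops sending heartbeats drops out of the candidate list after the timeout, so no new requests are routed to it.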
8.3.2 Using high-availability distributed filesystems with NoSQL databases
Most NoSQL systems are designed to work on a high-availability filesystem such as the
Hadoop Distributed File System (HDFS). If you're using a NoSQL system such as Cassandra,
you'll see that it has its own HDFS-compatible filesystem. Building a NoSQL
system around a specific filesystem has advantages and disadvantages.
Advantages of using a distributed filesystem with a NoSQL database:
Reuse of reliable components: Reusing prebuilt and pretested system components
saves time and money. Your NoSQL system doesn't need to duplicate the functions
of a distributed filesystem. Additionally, your organization may already have the
infrastructure and trained staff who know how to set up and configure these systems.

Customizable per-folder availability: Most distributed filesystems can be configured
on a folder-by-folder basis for high availability. This avoids storing input or output
datasets on a local filesystem with single points of failure. These systems can be
configured to store your data in multiple locations; the default is generally three.
This means that a client request would only fail if all three systems crashed at the
same time. The odds of this occurring are low enough that three copies are sufficient
for most service levels.

Rack and site awareness: Distributed filesystem software is designed to factor in
how computer clusters are organized in your data center. When you set up your
filesystem, you indicate which nodes are placed in which racks, with the assumption
that nodes within a rack have higher bandwidth between them than nodes in different
racks. Racks can also be placed in different data centers, and filesystems can
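
The replication and rack-awareness ideas in the list above can be sketched in a few lines of Python. This is an illustration under explicit assumptions, not the placement policy of HDFS or any other filesystem: the function names are hypothetical, node failures are treated as independent (an idealization), and the round-robin rack policy is a deliberately simplified stand-in for a real rack-aware policy.

```python
def all_replicas_down_probability(p_node_down, replicas=3):
    """Probability that every replica is unavailable at once,
    assuming nodes fail independently (an idealization)."""
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be between 0 and 1")
    return p_node_down ** replicas


def place_replicas(nodes_by_rack, replicas=3):
    """Choose nodes for the replicas, visiting racks round-robin so that
    copies land on as many distinct racks as possible (hypothetical policy)."""
    racks = sorted(nodes_by_rack)
    queues = {rack: list(nodes_by_rack[rack]) for rack in racks}
    chosen = []
    # Keep cycling through the racks until enough nodes are chosen
    # or every rack has run out of candidates.
    while len(chosen) < replicas and any(queues.values()):
        for rack in racks:
            if queues[rack] and len(chosen) < replicas:
                chosen.append((rack, queues[rack].pop(0)))
    if len(chosen) < replicas:
        raise ValueError("not enough nodes for the requested replica count")
    return chosen
```

For example, if each node were unavailable 1% of the time (an assumed figure), `all_replicas_down_probability(0.01)` would put the chance of all three copies being unavailable together at roughly one in a million. Correlated failures, such as a whole rack losing power, break the independence assumption, which is one reason a placement policy spreads copies across racks.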
 