transactions that occurred while it was down? Who should store these transactions and
where should they be stored? These questions led to a new class of products that
specialize in database replication and synchronization.
Replication differs from sharding, which we discussed in chapter 2. Sharding
distributes records across different processors but doesn't duplicate the data. In addition,
sharding allows reads and writes to be distributed to multiple systems but doesn't
increase system availability. On the other hand, replication can increase availability
and read access speeds by allowing read requests to be performed by slave systems. In
general, replication doesn't increase the performance of write operations to a
database. Because data has to be copied to multiple systems, replication sometimes
slows down total write throughput rates. In the end, replication and sharding are
independent processes and in appropriate situations can be used together.
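The difference between the two can be sketched in a few lines of Python. This is a toy illustration only: the class names ShardedStore and ReplicatedStore, the helper shard_for, and the use of in-memory dictionaries as "nodes" are all assumptions made for the example, not part of any particular product.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick exactly one shard for a key using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Sharding: each record lives on exactly one node; no copies are kept."""

    def __init__(self, num_shards: int):
        self.nodes = [{} for _ in range(num_shards)]

    def put(self, key, value):
        # One write, sent to the single node that owns the key.
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


class ReplicatedStore:
    """Replication: every write is copied to the master and to all slaves."""

    def __init__(self, num_slaves: int):
        self.master = {}
        self.slaves = [{} for _ in range(num_slaves)]
        self._next = 0  # round-robin pointer for spreading reads

    def put(self, key, value):
        # One logical write becomes 1 + num_slaves physical writes.
        self.master[key] = value
        for slave in self.slaves:
            slave[key] = value

    def get(self, key):
        # Reads can be served by any slave, which spreads the read load.
        slave = self.slaves[self._next % len(self.slaves)]
        self._next += 1
        return slave.get(key)
```

In this sketch a sharded write touches one node, so losing that node loses access to its records, while a replicated write touches every node, which costs extra write work but leaves a full copy available for reads on each slave.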
So what should happen if a slave system crashes? It doesn't make sense to have the
master reject all transactions, since that would make the system unavailable for writes
whenever any slave system crashed. If you allow the master to continue accepting updates,
you'll need a process to resync the slave system when it comes back online.
One common solution to the slave resync problem is to use a completely separate
piece of software called a reliable messaging system or message store, as shown in figure 3.9.
Reliable messaging systems accept messages even if a remote system isn't responding.
When used in a master/slave configuration, these systems queue all update messages
while one or more slave systems are down, and forward them once the slave
system is back online. Because every message is eventually delivered, the master and
slave remain in sync.
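The queue-and-forward behavior described above can be sketched as a small in-memory model. Everything here is hypothetical and chosen for illustration: the class names MessageStore, Master, and Slave, the per-subscriber queue layout, and the catch_up method are assumptions, and a real reliable messaging system would also persist messages to disk and acknowledge delivery.

```python
from collections import deque


class MessageStore:
    """Keeps one queue of pending updates per subscriber (hypothetical design)."""

    def __init__(self):
        self.queues = {}  # subscriber name -> deque of pending (key, value) updates

    def subscribe(self, name):
        self.queues.setdefault(name, deque())

    def publish(self, update):
        # Accept the message regardless of whether any subscriber is reachable.
        for queue in self.queues.values():
            queue.append(update)

    def drain(self, name):
        # Hand over pending updates in the order they were published.
        queue = self.queues[name]
        while queue:
            yield queue.popleft()


class Slave:
    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.data = {}
        self.online = True
        store.subscribe(name)

    def catch_up(self):
        # An offline slave applies nothing; its updates wait in the store.
        if not self.online:
            return
        for key, value in self.store.drain(self.name):
            self.data[key] = value


class Master:
    def __init__(self, store):
        self.store = store
        self.data = {}

    def write(self, key, value):
        # The master never blocks on slaves; it only writes to the message store.
        self.data[key] = value
        self.store.publish((key, value))
```

If one slave is marked offline while the master performs a few writes, its queue in the store simply grows; calling catch_up after setting online back to True replays the missed updates in order and leaves its data equal to the master's.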
Replication is a complex problem when one or more systems go offline, even if
only for a short period of time. Knowing exactly what information has changed and
resyncing the changed data is critical for reliability. Without some way of breaking
large databases into smaller subsets for comparison, replication becomes impractical.
This is why NoSQL databases that use consistent hashing (discussed in chapter 2) may
be a better solution.
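One way to picture "breaking a large database into smaller subsets for comparison" is to hash keys into buckets and compare one digest per bucket instead of comparing every record. The sketch below is an assumption-laden illustration: the function names bucket_of, bucket_digests, and stale_buckets are hypothetical, the bucket count of 16 is arbitrary, and plain modulo bucketing is used only because it is the simplest scheme to show.

```python
import hashlib


def bucket_of(key: str, num_buckets: int) -> int:
    """Map a key to a bucket with a stable hash (simple modulo scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def bucket_digests(records: dict, num_buckets: int) -> list:
    """Return one digest per bucket summarizing all records in that bucket."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in records.items():
        buckets[bucket_of(key, num_buckets)].append((key, value))
    digests = []
    for items in buckets:
        h = hashlib.sha256()
        # Sort so the digest doesn't depend on insertion order.
        for key, value in sorted(items):
            h.update(repr((key, value)).encode("utf-8"))
        digests.append(h.hexdigest())
    return digests


def stale_buckets(master: dict, slave: dict, num_buckets: int = 16) -> list:
    """List the buckets whose contents differ between master and slave."""
    m = bucket_digests(master, num_buckets)
    s = bucket_digests(slave, num_buckets)
    return [i for i in range(num_buckets) if m[i] != s[i]]
```

With this layout, a resync only needs to exchange a short list of digests and then copy the records in the buckets that differ, rather than shipping the whole database to find a handful of changed rows.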
NoSQL systems also need to solve the database replication problem, but unlike
relational databases, NoSQL systems need to synchronize not only tables, but other
structures as well, like graphs and documents. The technologies used to replicate
Figure 3.9 Using message stores for reliable data replication. Message stores can be used to increase the reliability of the data on each slave database, even if the slave systems are unavailable for a period of time. When slave systems restart, they can access an external message store to retrieve the transactions they missed when they were unavailable.
Diagram labels: a master database, a message store, and two slave databases. The master writes all update transactions to the message store. Update messages stay in the message store until all subscribers get a copy of the message.
 