thousands, in fact—but if you go that far, you'll probably also need to configure your
operating system's TCP networking settings. On GNU/Linux systems, you need to
increase the somaxconn limit from its default of 128, and check the tcp_max_syn_backlog
settings in sysctl (there's an example a bit later in this section).
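As a minimal sketch of that tuning, both limits can be raised at runtime with sysctl; the values here are illustrative assumptions, not recommendations, and they should go into /etc/sysctl.conf to survive a reboot:

```shell
# Raise the listen-queue limits (illustrative values; tune for your workload).
# net.core.somaxconn caps the accept backlog an application may request;
# net.ipv4.tcp_max_syn_backlog caps half-open (SYN_RECV) connections.
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
```

Note that an application must also ask for a larger backlog in its listen() call; somaxconn is only the ceiling the kernel will grant.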
You need to design your network for good performance, rather than just accepting
whatever you get by default. To begin, analyze how many hops are between the nodes,
and map the physical network layout. For instance, suppose you have 10 web servers
connected to a “Web” switch via gigabit Ethernet (1 GigE), and this switch is connected
to the “Database” switch via 1 GigE as well. If you don't take the time to trace the
connections, you might never realize that your total bandwidth from all database
servers to all web servers is limited to a gigabit! Each hop adds latency, too.
It's a good idea to monitor network performance and errors on all network ports.
Monitor every port on servers, on routers, and on switches. The Multi Router Traffic
Grapher, or MRTG (http://oss.oetiker.ch/mrtg/), is the tried-and-true open source solution
for device monitoring. Other common tools for monitoring network performance
(as opposed to devices) are Smokeping (http://oss.oetiker.ch/smokeping/) and Cacti
(http://www.cacti.net).
Physical separation matters a lot in networking. Inter-city networks will have much
worse latency than your data center's LAN, even if the bandwidth is technically the
same. If the nodes are really widely separated, the speed of light actually matters. For
example, if you have data centers on the west and east coasts of the US, they'll be
separated by about 3,000 miles. The speed of light is about 186,000 miles per second, so
a one-way trip cannot be any faster than 16 ms, and a round-trip takes at least 32 ms. The physical
distance is not the only performance consideration, either: there are devices in between
as well. Repeaters, routers, and switches all degrade performance somewhat. Again,
the more widely separated the network nodes are, the more unpredictable and unreliable
the links will be.
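The cross-country figures above are simple arithmetic, which you can check with a quick awk one-liner. Keep in mind this is a theoretical floor: light travels roughly 30% slower in fiber than in a vacuum, and every routing hop adds delay on top.

```shell
# 3,000 miles at 186,000 miles/second, converted to milliseconds.
awk 'BEGIN { printf "one-way: %.1f ms, round-trip: %.1f ms\n",
             3000/186000*1000, 2*3000/186000*1000 }'
```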
It's a good idea to try to avoid real-time cross-data center operations as much as
possible. 11 If this isn't possible, you should make sure your application handles network
failures gracefully. For example, you don't want your web servers to fork too many
Apache processes because they are all stalled trying to connect to a remote data center
over a link that has significant packet loss.
At the local level, use at least 1 GigE if you're not already. You might need to use a 10
GigE connection for the backbone between switches. If you need more bandwidth than
that, you can use network trunking: connecting multiple network interface cards (NICs)
to get more bandwidth. Trunking is essentially parallelization of networking, and it can
be very helpful as part of a high-availability strategy.
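On GNU/Linux, trunking is commonly implemented with the kernel's bonding driver. Here is a hypothetical sketch using 802.3ad (LACP) mode; the interface names eth0 and eth1 are assumptions, the commands require root, and the upstream switch must be configured for LACP as well:

```shell
# Hypothetical bonding setup; eth0/eth1 are assumed NIC names.
# mode 802.3ad aggregates links via LACP; miimon polls link state every 100 ms.
ip link add bond0 type bond mode 802.3ad miimon 100
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up
```

Note that LACP hashes each flow onto a single physical link, so one TCP connection never exceeds the speed of one NIC; the aggregate bandwidth and the failover behavior are what you gain.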
11. Replication doesn't count as a real-time cross-data center operation. It's not real-time, and it's often a
good idea to replicate your data to a remote location for safety. We cover this more in the next chapter.
 