thousands, in fact—but if you go that far, you'll probably also need to configure your
operating system's TCP networking settings. On GNU/Linux systems, you need to
increase the somaxconn limit from its default of 128, and check the tcp_max_syn_backlog
settings in sysctl (there's an example a bit later in this section).
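As a minimal sketch of that tuning, both limits can be raised at runtime with sysctl; the values here are illustrative assumptions, not recommendations, and they should go into /etc/sysctl.conf to survive a reboot:

```shell
# Raise the listen-queue limits (illustrative values; tune for your workload).
# net.core.somaxconn caps the accept backlog an application may request;
# net.ipv4.tcp_max_syn_backlog caps half-open (SYN_RECV) connections.
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
```

Note that an application must also ask for a larger backlog in its listen() call; somaxconn is only the ceiling the kernel will grant.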
You need to design your network for good performance, rather than just accepting
whatever you get by default. To begin, analyze how many hops are between the nodes,
and map the physical network layout. For instance, suppose you have 10 web servers
connected to a “Web” switch via gigabit Ethernet (1 GigE), and this switch is connected
to the “Database” switch via 1 GigE as well. If you don't take the time to trace the
connections, you might never realize that your total bandwidth from all database
servers to all web servers is limited to a gigabit! Each hop adds latency, too.
It's a good idea to monitor network performance and errors on all network ports.
Monitor every port on servers, on routers, and on switches. The Multi Router Traffic
Grapher, or MRTG (http://oss.oetiker.ch/mrtg/), is the tried-and-true open source solution
for device monitoring. Other common tools for monitoring network performance
(as opposed to devices) are Smokeping (http://oss.oetiker.ch/smokeping/) and Cacti
(http://www.cacti.net).
Physical separation matters a lot in networking. Inter-city networks will have much
worse latency than your data center's LAN, even if the bandwidth is technically the
same. If the nodes are really widely separated, the speed of light actually matters. For
example, if you have data centers on the west and east coasts of the US, they'll be
separated by about 3,000 miles. The speed of light is about 186,000 miles per second, so
a one-way trip cannot be any faster than 16 ms, and a round-trip takes at least 32 ms. The physical
distance is not the only performance consideration, either: there are devices in between
as well. Repeaters, routers, and switches all degrade performance somewhat. Again,
the more widely separated the network nodes are, the more unpredictable and unreliable
the links will be.
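The cross-country figures above are simple arithmetic, which you can check with a quick awk one-liner. Keep in mind this is a theoretical floor: light travels roughly 30% slower in fiber than in a vacuum, and every routing hop adds delay on top.

```shell
# 3,000 miles at 186,000 miles/second, converted to milliseconds.
awk 'BEGIN { printf "one-way: %.1f ms, round-trip: %.1f ms\n",
             3000/186000*1000, 2*3000/186000*1000 }'
```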
It's a good idea to try to avoid real-time cross-data center operations as much as
possible. 11 If this isn't possible, you should make sure your application handles network
failures gracefully. For example, you don't want your web servers to fork too many
Apache processes because they are all stalled trying to connect to a remote data center
over a link that has significant packet loss.
At the local level, use at least 1 GigE if you're not already. You might need to use a 10
GigE connection for the backbone between switches. If you need more bandwidth than
that, you can use network trunking: connecting multiple network interface cards (NICs)
to get more bandwidth. Trunking is essentially parallelization of networking, and it can
be very helpful as part of a high-availability strategy.
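On GNU/Linux, trunking is commonly implemented with the kernel's bonding driver. Here is a hypothetical sketch using 802.3ad (LACP) mode; the interface names eth0 and eth1 are assumptions, the commands require root, and the upstream switch must be configured for LACP as well:

```shell
# Hypothetical bonding setup; eth0/eth1 are assumed NIC names.
# mode 802.3ad aggregates links via LACP; miimon polls link state every 100 ms.
ip link add bond0 type bond mode 802.3ad miimon 100
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up
```

Note that LACP hashes each flow onto a single physical link, so one TCP connection never exceeds the speed of one NIC; the aggregate bandwidth and the failover behavior are what you gain.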
11. Replication doesn't count as a real-time cross-data center operation. It's not real-time, and it's often a
good idea to replicate your data to a remote location for safety. We cover this more in the next chapter.
 