InfiniBand architecture has the following communication characteristics:
User-level access to message passing
Remote Direct Memory Access (RDMA) in read and write mode
Messages of up to 2 GB in a single transfer
The memory protection mechanism defined by the InfiniBand architecture allows an InfiniBand HCA to transfer
data directly into or out of an application buffer. To protect these buffers from unauthorized access, a process called
memory registration is employed. Memory registration allows data transfers to be initiated directly from user mode,
eliminating costly context switches to the kernel. Another benefit is that RDMA lets the InfiniBand HCA access
application memory directly, bypassing the O/S buffers. This removes the need to copy data to or from system
buffers on a send or receive operation, respectively.
The InfiniBand architecture also has another unique feature called a memory window. A memory window provides
a way for an application to grant another application remote read and/or write access to a specified buffer at
byte-level granularity. Memory windows are used in conjunction with RDMA read or RDMA write to control remote
access to the application buffers. Data can be transferred by either the push or the pull method: either the node
holding the data sends (pushes) it to the requester, or the requester reads (pulls) it from the holder.
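The access rule a memory window enforces can be illustrated with a small toy model. This is only a sketch of the idea (a byte range plus read/write permissions), not the InfiniBand verbs API; the class name MemoryWindow and the methods remote_read and remote_write are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryWindow:
    """Toy model: grants remote access to buffer[offset : offset + length]."""
    buffer: bytearray
    offset: int
    length: int
    allow_read: bool = False
    allow_write: bool = False

    def _check(self, start: int, size: int, allowed: bool) -> None:
        # Reject the access if this type of access was not granted.
        if not allowed:
            raise PermissionError("access type not granted by this window")
        # Reject the access if any byte falls outside the window.
        if size < 0 or start < self.offset or start + size > self.offset + self.length:
            raise PermissionError("access falls outside the window")

    def remote_read(self, start: int, size: int) -> bytes:
        """Pull: the requester reads bytes from the holder's buffer."""
        self._check(start, size, self.allow_read)
        return bytes(self.buffer[start:start + size])

    def remote_write(self, start: int, data: bytes) -> None:
        """Push: the sender writes bytes into the holder's buffer."""
        self._check(start, len(data), self.allow_write)
        self.buffer[start:start + len(data)] = data
```

In this model a window opened with only allow_read=True over bytes 2 through 5 of a buffer permits a pull of exactly those bytes and rejects both a write and a read that starts at byte 0.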
Table 14-1 lists the throughput differences between the two types of interconnect protocols.
Table 14-1. Interconnect Throughput
Interconnect Type     Throughput
Gigabit Ethernet      80 Megabits per second
InfiniBand            160 Megabits per second
Oracle supports InfiniBand using the Reliable Datagram Sockets (RDS) protocol. This protocol multiplexes UDP
packets over an InfiniBand connection, improving performance in an Oracle RAC environment.
RDS is a reliable-socket off-load driver and inter-processor communication (IPC) protocol with low overhead,
low latency, and high bandwidth. RDS enables enhanced application performance and cluster scalability. RDS
over InfiniBand uses approximately 50% less CPU per operation than IPoIB (Internet Protocol over InfiniBand) and
operates with approximately half the latency of UDP over Ethernet.
Network Throughput and Bandwidth
Bandwidth refers to the data-carrying capacity currently available on the network, whereas available throughput is
the throughput actually possible, given the end-system hardware (CPU speed and load, network interface card [NIC],
I/O bus speed, disk speed), O/S, TCP stack, TCP parameters, and so on. We now look at some network-related
tuning options.
Tuning Network Buffer Sizes
The network buffer sizes listed in the Oracle installation documents as a basic installation and configuration
requirement are the bare minimum required for RAC to function. Monitoring and measuring network latencies can
help determine whether to increase these buffer sizes further, provided the O/S supports such an increase.
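One common way to relate a measured latency to a buffer size is the bandwidth-delay product (link bandwidth multiplied by round-trip time), which gives the number of bytes that must be in flight to keep the link busy. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and floor_bytes is only a placeholder for whatever minimum the installation documents specify for your platform.

```python
def bdp_bytes(bandwidth_bits_per_sec: float, rtt_ms: float) -> int:
    """Bandwidth-delay product in bytes for a given link speed and round-trip time."""
    bits_in_flight = bandwidth_bits_per_sec * rtt_ms / 1000  # ms -> seconds
    return round(bits_in_flight / 8)                          # bits -> bytes


def suggested_buffer_bytes(bandwidth_bits_per_sec: float,
                           rtt_ms: float,
                           floor_bytes: int = 262144) -> int:
    """Never go below the documented minimum (floor_bytes is a placeholder value)."""
    return max(floor_bytes, bdp_bytes(bandwidth_bits_per_sec, rtt_ms))


if __name__ == "__main__":
    # Example: a 1 Gb/s interconnect with a measured 1 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 1.0))
    # Example: a 10 Gb/s interconnect with a measured 0.5 ms round-trip time.
    print(suggested_buffer_bytes(10_000_000_000, 0.5))
```

On a low-latency private interconnect the computed value is often smaller than the documented minimum, in which case the minimum applies; the calculation mainly helps when measured latencies are higher than expected.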
TCP uses a congestion window scheme to determine how many packets can be transmitted at any one
time. The maximum congestion window size is determined by how much buffer space the kernel has allocated for
each socket. If the buffers are too small, the TCP congestion window will never completely open. On the other hand,
if the buffers are too large, the sender can overrun the receiver, causing the TCP window to shut down. The common
 
 