Network Practices - Expert Oracle RAC 12c

Protocols

Several higher-level protocols, such as TCP, UDP, and RDS, are used in a typical RAC cluster [4]. This section covers
these protocols and their pros and cons.

TCP/IP is a stateful protocol. A connection between the sender and the receiver must be established before a
segment can be sent. Every segment must be acknowledged (TCP ACK) before its transmission is considered
complete. For example, after sending a TCP/IP segment from one IP address and port number to another
IP address and port number, the kernel waits for an acknowledgement before declaring that transmission complete.
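The connection requirement can be seen from user space with a few lines of socket code. The following is a minimal sketch, not anything taken from RAC or Clusterware: it assumes a loopback address and an OS-assigned port, and the function names tcp_echo_once and send_without_connect are hypothetical. The acknowledgements themselves are handled by the kernel and are not visible to the application.

```python
import socket
import threading


def tcp_echo_once(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection and return what comes back."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()  # returns once a client connection is established
        with conn:
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # the client closed its sending side
                    break
                chunks.append(chunk)
            conn.sendall(b"".join(chunks))

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()

    received = []
    # connect() must succeed before any data can be sent.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)  # signal end of data to the server
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            received.append(chunk)

    worker.join()
    server.close()
    return b"".join(received)


def send_without_connect() -> bool:
    """Return True if sending on an unconnected TCP socket is rejected."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.send(b"data")
        except OSError:  # the exact errno differs between operating systems
            return True
    return False


if __name__ == "__main__":
    print(tcp_echo_once(b"block 42"))
    print(send_without_connect())
```

The second function shows the stateful nature of the protocol: without an established connection, the kernel refuses the send.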

UDP is a stateless protocol. No existing connection is required to send a datagram. A transmission is considered
complete as soon as the frames leave the network interface. No ACK is required at all; it is up to the application to
perform error processing. For example, if a UDP packet is lost in transmission, RAC processes re-request the packet.
The UDP protocol layer is built upon the IP layer. However, both UDP and TCP/IP have the overhead of double copy
and double buffering: outgoing data must be copied from user space to kernel space before it can be sent, and
received packets are processed in kernel space and then copied into user space.
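Application-level error processing over UDP can be sketched as a timeout followed by a re-request. The example below is only an illustration of that idea and makes no claim about how RAC processes implement it: the loss is simulated by a responder that deliberately ignores the first request, and the names request_with_retry, lossy_responder, demo_udp_retry, the STOP sentinel, and the "block:" reply prefix are all hypothetical.

```python
import socket
import threading


def request_with_retry(sock, addr, payload, retries=5, timeout=0.5):
    """Send a datagram and re-send it until a reply arrives or retries run out."""
    sock.settimeout(timeout)
    for attempt in range(1, retries + 1):
        sock.sendto(payload, addr)  # no connection and no protocol-level ACK
        try:
            reply, _ = sock.recvfrom(4096)
            return reply, attempt
        except socket.timeout:
            continue  # treat silence as a lost packet and ask again
    raise TimeoutError("no reply after %d attempts" % retries)


def lossy_responder(sock, drop_first):
    """Answer requests, ignoring the first drop_first of them to simulate loss."""
    dropped = 0
    while True:
        data, peer = sock.recvfrom(4096)
        if data == b"STOP":  # hypothetical sentinel used to end the demo
            return
        if dropped < drop_first:
            dropped += 1
            continue
        sock.sendto(b"block:" + data, peer)


def demo_udp_retry():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    addr = server.getsockname()
    worker = threading.Thread(target=lossy_responder, args=(server, 1), daemon=True)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return request_with_retry(client, addr, b"42")
    finally:
        client.sendto(b"STOP", addr)
        worker.join()
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo_udp_retry())
```

Note that the client never calls connect(); the first sendto() succeeds even though its datagram is never answered, and recovery is entirely the application's responsibility.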

The RDS (Reliable Datagram Sockets) protocol requires specific hardware (InfiniBand fabric) and kernel drivers
to implement. With the RDS protocol, all error handling is offloaded to the InfiniBand fabric, and a transmission
is considered complete as soon as the frame reaches the fabric. The RDS protocol is used in the Exadata platform,
where it provides lower latency and lower resource usage. As with UDP, there is no ACK mechanism in the RDS protocol.
Further, RDS is designed as a zero-copy protocol, so messages can be sent or received without a copy operation.
The RDS protocol bypasses the IP layer completely.

The UDP protocol is employed for cache fusion on Unix and Linux platforms. On Exadata platforms, the RDS
protocol is used for cache fusion. You can implement the RDS protocol on non-Exadata platforms too. At the time of
writing, InfiniBand fabric hardware and RDS kernel drivers are available in some flavors of Unix and Linux. There are
also vendor-specific protocols: for example, the LLT protocol is used for cache fusion with Veritas SFRAC.

Clusterware uses TCP/IP for the heartbeat mechanism between nodes. While UDP stands for User Datagram
Protocol, it is sometimes, in a lighter vein, referred to as the Unreliable Datagram Protocol. However, this does not
mean that UDP will suffer from data loss; thousands of customers using UDP on Unix platforms are proof that UDP
doesn't affect the reliability of an application. In essence, the UDP protocol is as reliable as the network underneath.

While there are subtle differences between UDP and other protocols, UDP processing is simpler and easier to
explain. Figure 9-3 shows a UDP function stack on a Linux platform. It is not necessary to understand the details
of these function calls (after all, this chapter is not about network programming); a high-level understanding of
the function execution flow is good enough. On the sending side, an application process issues a send system call,
which invokes udp_sendmsg; udp_sendmsg calls IP layer functions, which in turn call the device driver functions.
On the receiving side, kernel threads call IP layer functions and then UDP layer functions, and then the application
process is scheduled on the CPU to drain the socket buffers into application buffers.
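The last step, draining the socket buffers into application buffers, can be observed from user space. The following is a minimal sketch under stated assumptions: it uses loopback UDP sockets, the function names drain and demo_drain and the message contents are hypothetical, and the 0.2-second idle timeout is an arbitrary choice. The datagrams are queued in the kernel's receive buffer for the socket until the application process reads them.

```python
import select
import socket


def drain(sock, idle_timeout=0.2):
    """Copy every queued datagram from the kernel socket buffer into a list."""
    messages = []
    while True:
        readable, _, _ = select.select([sock], [], [], idle_timeout)
        if not readable:  # nothing arrived within idle_timeout: buffer is empty
            return messages
        data, _ = sock.recvfrom(65535)  # kernel-to-user copy of one datagram
        messages.append(data)


def demo_drain(count=10):
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    addr = receiver.getsockname()
    # Size of the kernel receive buffer for this socket, as reported by the OS.
    rcvbuf = receiver.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(count):
        # Each datagram waits in the receiver's socket buffer; nobody reads yet.
        sender.sendto(b"msg-%d" % i, addr)
    sender.close()

    messages = drain(receiver)
    receiver.close()
    return messages, rcvbuf


if __name__ == "__main__":
    msgs, size = demo_drain()
    print(len(msgs), "datagrams drained; SO_RCVBUF =", size)
```

If the application does not drain the buffer quickly enough and the buffer fills, further datagrams are dropped by the kernel, which is one reason the application must be prepared to re-request lost packets.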

[4] Other protocols may be in use in a third-party cluster. For example, a RAC database uses the LLT protocol in a Veritas SFRAC cluster.
