Routing API (IPv6 Unicast Routing Protocols) Part 1

BSD variants provide two major interfaces for accessing the kernel routing table: a generic routing socket and the sysctl() library function.

Routing Sockets

A routing socket is a generic socket interface to the kernel’s routing table. Via a routing socket, an application can add or delete a routing table entry, modify an existing entry, or get the entry that would be used for a given destination.

An application and the kernel exchange messages over a routing socket. Each message begins with common header fields, followed by information specific to the message type. From the unicast routing perspective, the most important messages are routing messages, which consist of a fixed-format header structure followed by a set of socket address structures. The header of a routing message is the rt_msghdr{} structure, whose definition is shown in Listing 1-3.

Listing 1-3


The first three members are common to all messages communicated over a routing socket. The rtm_type member uniquely identifies the purpose of the message. Two message types are particularly important for unicast routing: RTM_ADD for adding a new routing entry to the kernel, and RTM_DELETE for deleting an existing routing entry from the kernel.

Another important member is rtm_addrs, which specifies which types of addresses follow the rt_msghdr{} structure. This is a flag bit field, as commented in the listing. The flags shown in Table 1-7 are commonly used for unicast routing purposes. Among those, RTA_DST and (when present) RTA_NETMASK define a specific prefix, the key of the corresponding routing entry.
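The way rtm_addrs announces the addresses that follow the header can be sketched as below. This is only an illustration: the DEMO_ constants are stand-ins that assume the values BSD's <net/route.h> gives RTA_DST, RTA_GATEWAY, and RTA_NETMASK (0x1, 0x2, 0x4), and demo_describe_addrs() is a hypothetical helper, not part of any system API. The addresses appear in the message in order of increasing bit value.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the BSD flags (assumed values: RTA_DST 0x1,
 * RTA_GATEWAY 0x2, RTA_NETMASK 0x4), renamed so this compiles anywhere. */
#define DEMO_RTA_DST      0x1u
#define DEMO_RTA_GATEWAY  0x2u
#define DEMO_RTA_NETMASK  0x4u

/* Write the names of the addresses announced by `addrs` into `out`,
 * in the order in which they would follow the header (lowest bit first). */
const char *demo_describe_addrs(unsigned addrs, char *out, size_t len)
{
    static const struct { unsigned bit; const char *name; } tbl[] = {
        { DEMO_RTA_DST,     "dst" },
        { DEMO_RTA_GATEWAY, "gateway" },
        { DEMO_RTA_NETMASK, "netmask" },
    };
    size_t i;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if ((addrs & tbl[i].bit) == 0)
            continue;
        if (out[0] != '\0')
            strncat(out, " ", len - strlen(out) - 1);
        strncat(out, tbl[i].name, len - strlen(out) - 1);
    }
    return out;
}
```

For a prefix route, the mask DEMO_RTA_DST | DEMO_RTA_GATEWAY | DEMO_RTA_NETMASK describes three socket address structures; for a host route the netmask bit is simply left out.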

In this subsection we show a complete example application that installs an IPv6 routing entry using a routing socket. This is essentially the same as the addroute() function of the route6d program, but we use a separate program here as a complete template for any IPv6 routing application.


Table 1-7

Flag name      Description
RTA_DST        destination address
RTA_GATEWAY    gateway address
RTA_IFA        associated interface address
RTA_IFP        outgoing interface
RTA_NETMASK    network mask for the destination

This program, which we call rtadd6, takes two command-line arguments. The first is an IPv6 address or prefix, the key of the routing entry. The second is the gateway IPv6 address for this entry. As indicated in Listing 1-1, the second argument is usually a link-local IPv6 address, in which case the link must be identified unambiguously using the extended textual format defined in [RFC4007].

For example, an invocation such as

    rtadd6 2001:db8:1234::/48 fe80::1%ne0

will create a new routing entry for prefix 2001:db8:1234::/48 with the gateway address of fe80::1 on the link attached to interface ne0. Note that rtadd6 requires super-user privilege. The new entry will then appear in the netstat output.


Main Function of rtadd6

The following listings cover the entire source code of the rtadd6 program. We begin with the main function of rtadd6, which is located near the end of the source file, rtadd6.c.

Open Routing Socket Listing 1-4


59-66 The rtflags and rtaddrs variables are initialized with the default settings for the rtm_flags and rtm_addrs members of the rt_msghdr{} structure. Because this program always specifies the destination address and the gateway, the corresponding flags are set by default. RTF_UP is specified just in case but is not actually necessary; the kernel sets this flag automatically when it creates a new entry.

The buffer buf holds the routing message. Its size (512 bytes) is an arbitrary choice but is more than enough for this purpose.

73-77 A routing socket is created as a raw socket.

Parse Destination and Gateway Addresses Listing 1-5


79-89 If the first argument to this program contains a slash ("/"), it is treated as an IPv6 prefix; otherwise it is treated as an IPv6 address. In the former case, the plen2mask() function converts the prefix length (which follows the slash) into the corresponding network mask and stores it in variable mask as an IPv6 socket address structure. The RTA_NETMASK bit is then set in rtaddrs.
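The conversion from a prefix length to a network mask can be sketched as follows. This is a minimal illustration rather than the plen2mask() code from rtadd6.c, which is not reproduced here; the name demo_plen2mask() and its return convention are assumptions. The sin6_len member is set only where the SIN6_LEN macro indicates that it exists, as on BSD-derived systems.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Fill `mask` with the IPv6 network mask for prefix length `plen`.
 * Returns 0 on success, -1 if plen is outside 0..128. */
int demo_plen2mask(int plen, struct sockaddr_in6 *mask)
{
    int i;

    assert(mask != NULL);
    if (plen < 0 || plen > 128)
        return -1;

    memset(mask, 0, sizeof(*mask));
    mask->sin6_family = AF_INET6;
#ifdef SIN6_LEN
    mask->sin6_len = sizeof(*mask);     /* BSD-derived systems only */
#endif
    /* Whole bytes covered by the prefix are all ones. */
    for (i = 0; i < plen / 8; i++)
        mask->sin6_addr.s6_addr[i] = 0xff;
    /* A partially covered byte keeps only its leading bits. */
    if (plen % 8 != 0)
        mask->sin6_addr.s6_addr[i] = (unsigned char)(0xff << (8 - plen % 8));
    return 0;
}
```

For the /48 prefix used in the example above, the first six bytes of the mask are 0xff and the remaining ten are zero.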

90-91 If the first argument does not contain a slash, the entry is a host route, and the RTF_HOST flag is set in rtflags.

98-114 The destination address (which might be the address part of a prefix) and the gateway address are converted to IPv6 socket address structures by the getaddrinfo() library function. If either address is a link-local address written in the extended format, getaddrinfo() interprets the zone identifier and sets the sin6_scope_id member of the resulting socket address structure to the corresponding link index.
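The address conversion step might look like the sketch below. It is an illustration under assumptions, not the listing from rtadd6.c: the helper name demo_parse_in6() and the choice of hints (AI_NUMERICHOST, SOCK_DGRAM) are ours. When the text carries a zone identifier such as "%ne0", getaddrinfo() is expected to fill in sin6_scope_id.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Parse a numeric IPv6 address, optionally in "addr%zone" form, into *out.
 * Returns 0 on success or a getaddrinfo() error code on failure. */
int demo_parse_in6(const char *text, struct sockaddr_in6 *out)
{
    struct addrinfo hints, *res;
    int error;

    assert(text != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_DGRAM;     /* any type; keeps the result unique */
    hints.ai_flags = AI_NUMERICHOST;    /* numeric text only, no DNS lookup */

    error = getaddrinfo(text, NULL, &hints, &res);
    if (error != 0)
        return error;
    if (res->ai_addrlen != sizeof(*out)) {
        freeaddrinfo(res);
        return EAI_FAIL;
    }
    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

A caller can pass the error code to gai_strerror() to obtain a readable message.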

Prepare and Send Routing Message Listing 1-6


116-123 The base rt_msghdr{} structure is initialized. Since this program is adding a new routing entry, the message type (rtm_type) is RTM_ADD. The rtm_seq and rtm_pid members have no effect on this program but are initialized appropriately just in case. The rtm_flags and rtm_addrs members are set to the values prepared above.

125-129 The sin6 variable points to the end of the rt_msghdr{} structure (with padding, when necessary, so that the pointer is naturally aligned). The destination address stored in variable dst is copied there. The convertscope() function is then called in case the address is a link-local address, in which case it should be converted into the kernel internal form (as shown in Figure 1-42). Finally, variable sin6 is adjusted so that it points to the end of the socket address structure just filled in. ROUNDUP() is a macro that adds the padding needed to align the resulting pointer at the natural boundary.


On a 32-bit machine architecture, this is effectively a no-op, since the size of the sockaddr_in6{} structure (28 bytes) is a multiple of 32 bits.
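The padding rule can be sketched as a small function. The arithmetic follows the form of the ROUNDUP macro commonly found in BSD route(8) sources, but this is an assumption about the general technique, not the definition used in rtadd6.c; the name demo_roundup() and the explicit align parameter are hypothetical, added so that the 32-bit and 64-bit cases can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Round `len` up to the next multiple of `align`, which must be a power
 * of two. A zero length is padded to one full unit. */
size_t demo_roundup(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return len > 0 ? 1 + ((len - 1) | (align - 1)) : align;
}
```

With a 4-byte alignment, demo_roundup(28, 4) leaves the 28-byte sockaddr_in6{} length unchanged, whereas an 8-byte alignment pads it to 32.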

131-133 Likewise, the gateway address is copied into the following region of the buffer, and the convertscope() function makes any necessary adjustment for a link-local address.

135-139 If a network mask is specified, mask is copied after the gateway address. Since a network mask has no scope zone ambiguity, convertscope() need not be called here.

Figure 1-44 shows the buffer content after building the rt_msghdr{} structure and all the necessary address structures for the execution example shown above. Notice that the link-local address of the gateway embeds the link index (assumed to be 1 here) in the address field, even though we specified the address in the standard format per [RFC4007]. The convertscope() function performed the conversion.
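The embedding described above can be sketched as follows. This is an illustration of the KAME-style internal form, assuming the link index is stored in the second 16-bit word of the address in network byte order and that sin6_scope_id is then cleared; the function name demo_convertscope() is ours, and the actual convertscope() listing is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Convert a link-local address from the standard form (zone carried in
 * sin6_scope_id) to the embedded form (zone in bytes 2 and 3). Addresses
 * of any other scope are left untouched. */
void demo_convertscope(struct sockaddr_in6 *sin6)
{
    uint16_t id16;

    assert(sin6 != NULL);
    if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr) ||
        IN6_IS_ADDR_MC_LINKLOCAL(&sin6->sin6_addr)) {
        id16 = htons((uint16_t)sin6->sin6_scope_id);
        memcpy(&sin6->sin6_addr.s6_addr[2], &id16, sizeof(id16));
        sin6->sin6_scope_id = 0;
    }
}
```

For fe80::1 with a link index of 1, the address bytes after conversion read fe80:1::1 when printed, while a global address such as 2001:db8:1234:: is not modified.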

141-147 The rtm_msglen member is set to the total length of the message, and the message is written to the routing socket. The kernel first checks whether a routing entry for the specified destination already exists. If not, it creates a new one with the specified gateway; otherwise, the write() system call fails with the error EEXIST. For example, executing this program twice with the same arguments as above causes the second execution to fail with this error.


A careful implementation may thus want to separate this particular error from other general errors (in fact, the route6d program does that).
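One way to separate the EEXIST case is sketched below. The helper demo_check_add_error() is hypothetical, and treating an existing entry as benign is an assumption made for illustration; it is not necessarily how route6d handles the error. The function is meant to be called with the errno value saved right after a failed write() on the routing socket.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Classify the errno left by a failed write() of an RTM_ADD message.
 * Returns 0 if the entry already exists (treated as benign in this
 * sketch), -1 for any other error. */
int demo_check_add_error(int err)
{
    assert(err != 0);   /* call only after a failure */
    if (err == EEXIST) {
        fprintf(stderr, "route already exists; leaving it in place\n");
        return 0;
    }
    fprintf(stderr, "failed to add route: %s\n", strerror(err));
    return -1;
}
```

A caller would save errno immediately after write() returns a negative value, then pass it to this helper to decide whether to abort or continue.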


