Introduction to BGP4+ (IPv6 Unicast Routing Protocols) Part 2

NOTIFICATION message

The NOTIFICATION message is sent when an error condition is detected by a BGP speaker.

The BGP speaker terminates the connection immediately after sending the message. The NOTIFICATION message contains the message header and the additional fields that are shown in Figure 1-20.

Error Code This 1-byte field indicates the type of error that has occurred either during the peering process or during an established BGP session.

Error Subcode The value of this 1-byte field depends on the value of the Error Code field.

Data The Data field is variable in length and its content depends on both the Error Code and the Error Subcode. At a minimum the NOTIFICATION message is 21 bytes in size if Data is not present.

The various error codes are listed in Table 1-3.

TABLE 1-3

Error Code  Description                 Subcode  Description
----------  --------------------------  -------  ---------------------------------
1           Message Header Error        1        Connection Not Synchronized
                                        2        Bad Message Length
                                        3        Bad Message Type
2           OPEN Message Error          1        Unsupported Version Number
                                        2        Bad Peer AS
                                        3        Bad BGP Identifier
                                        4        Unsupported Optional Parameters
                                        5        Deprecated
                                        6        Unacceptable Hold Time
3           UPDATE Message Error        1        Malformed Attribute List
                                        2        Unrecognized Well-known Attribute
                                        3        Missing Well-known Attribute
                                        4        Attribute Flags Error
                                        5        Attribute Length Error
                                        6        Invalid ORIGIN Attribute
                                        7        Deprecated
                                        8        Invalid NEXT_HOP Attribute
                                        9        Optional Attribute Error
                                        10       Invalid Network Field
                                        11       Malformed AS_PATH
4           Hold Time Expired
5           Finite State Machine Error
6           Cease
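The NOTIFICATION layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function names are hypothetical, and the 19-byte message header (a 16-byte marker of all ones, a 2-byte length, and a 1-byte type with value 3 for NOTIFICATION) is assumed from RFC 4271 because this section does not restate it.

```python
import struct

# Error code names as listed in Table 1-3.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Time Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

BGP_HEADER_LEN = 19    # assumed: 16-byte marker + 2-byte length + 1-byte type
MSG_NOTIFICATION = 3   # assumed: message type value for NOTIFICATION


def build_notification(code, subcode, data=b""):
    """Return a complete NOTIFICATION message (header included) as bytes."""
    length = BGP_HEADER_LEN + 2 + len(data)
    header = b"\xff" * 16 + struct.pack("!HB", length, MSG_NOTIFICATION)
    return header + struct.pack("!BB", code, subcode) + data


def parse_notification(message):
    """Return (error name, subcode, data) from a NOTIFICATION message."""
    length, msg_type = struct.unpack("!HB", message[16:19])
    if msg_type != MSG_NOTIFICATION or length != len(message) or length < 21:
        raise ValueError("not a well-formed NOTIFICATION message")
    code, subcode = struct.unpack("!BB", message[19:21])
    return ERROR_CODES.get(code, "Unknown"), subcode, message[21:]
```

Calling build_notification(2, 6) yields the 21-byte minimum message mentioned above, carrying OPEN Message Error with the Unacceptable Hold Time subcode and no Data field.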

UPDATE message

Routing information is exchanged between BGP peers through the UPDATE message. The UPDATE message may advertise a new route, update an existing route, or withdraw a route.

The format of the UPDATE message is shown in Figure 1-21.

Withdrawn Routes Length This 2-byte field specifies the size of the Withdrawn Routes field in bytes. This field is an IPv4-specific field and is not used by other protocols such as IPv6.

Withdrawn Routes This is a variable-sized field that contains a list of routes that are withdrawn from service, perhaps due to a change in reachability. This field is an IPv4-specific field and is not used by other protocols such as IPv6. For IPv6, the MP_UNREACH_NLRI path attribute is used to withdraw IPv6 routes. Path attributes are described in Section 1.5.3, and the MP_UNREACH_NLRI attribute is described in detail in Section 1.5.4. Each entry in the Withdrawn Routes field has the <length, prefix> format.

FIGURE 1-10

Length This 1-byte field specifies the size of the prefix in bits that immediately follows this field. 0 is a special value indicating that the prefix matches all IP addresses, that is, the prefix has all zero values in every byte.

Prefix This field contains the prefix to be withdrawn. The Prefix field is variable in size. This field may be padded so that the prefix ends on a byte boundary. For example, if the Length field contains the value 19, then the Prefix field will be 3 bytes long, with the last 5 bits used as padding.

Total Path Attributes Length This 2-byte field specifies the size of the Path Attributes field in bytes. The Path Attributes field and the Network Layer Reachability Information field are not present if this field has a 0 value.

Path Attributes This is a variable-length field and has the <type, length, value> format. This field describes the properties of a path to a destination that is given in the Network Layer Reachability Information field. We will defer the discussion of this field to Section 1.5.3.

Network Layer Reachability Information (NLRI) This variable length field contains a list of destinations that are reachable and should be added into the local routing table. The paths to these destinations share the same set of properties that are described by the Path Attributes field. The NLRI field is an IPv4-specific field and is not used by other protocols such as IPv6. For IPv6, the NLRI is conveyed through the MP_REACH_NLRI path attribute, which is described in detail in Section 1.5.4. Each entry in the NLRI field has the <length, prefix> format.

Length This 1-byte field specifies the size of the prefix in bits that immediately follows this field. 0 is a special value indicating that the prefix matches all IP addresses, that is, the prefix has all zero values in every byte.

Prefix This field contains the reachable prefix. The Prefix field is variable in size. This field may be padded so that the prefix ends on a byte boundary. For example, if the Length field contains the value 19, then the Prefix field will be 3 bytes long, with the last 5 bits used as padding.
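The <length, prefix> encoding and the three variable parts of the UPDATE message can be illustrated with a short Python sketch. The function names are hypothetical, the code handles only the IPv4 fields described above, and split_update_body assumes that the message header has already been removed.

```python
import ipaddress
import struct


def encode_prefix(cidr):
    """Encode an IPv4 prefix such as '10.1.32.0/19' as <length, prefix>."""
    net = ipaddress.ip_network(cidr)
    nbytes = (net.prefixlen + 7) // 8           # round up to whole bytes
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_prefixes(buf):
    """Decode a run of IPv4 <length, prefix> entries into CIDR strings."""
    out, i = [], 0
    while i < len(buf):
        bits = buf[i]
        nbytes = (bits + 7) // 8
        raw = buf[i + 1:i + 1 + nbytes]
        if bits > 32 or len(raw) != nbytes:
            raise ValueError("malformed prefix entry")
        # Pad the truncated prefix back out to a full 4-byte address.
        addr = ipaddress.IPv4Address(raw + bytes(4 - nbytes))
        out.append(f"{addr}/{bits}")
        i += 1 + nbytes
    return out


def split_update_body(body):
    """Split an UPDATE body into (withdrawn routes, path attributes, NLRI)."""
    (wlen,) = struct.unpack_from("!H", body, 0)          # Withdrawn Routes Length
    withdrawn = body[2:2 + wlen]
    (alen,) = struct.unpack_from("!H", body, 2 + wlen)   # Total Path Attributes Length
    attrs_start = 4 + wlen
    attrs = body[attrs_start:attrs_start + alen]
    nlri = body[attrs_start + alen:]                     # whatever remains is NLRI
    return withdrawn, attrs, nlri
```

A 19-bit prefix such as 10.1.32.0/19 encodes to four bytes in total: one Length byte followed by three Prefix bytes. A Length of 0 encodes to a single byte with no Prefix bytes at all.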

Path Attributes

Path Attributes describe the various properties of the routes to which the attributes apply. The BGP route selection algorithm includes the path attributes in its computation for the best route as we will describe in Section 1.5.5. The Path Attributes are classified into four categories:

• Well-known mandatory

• Well-known discretionary

• Optional transitive

• Optional non-transitive

The well-known mandatory attributes must be recognized and processed by every BGP speaker. The well-known mandatory attributes must be included in all UPDATE messages that contain the NLRI. The well-known discretionary attributes must also be recognized and processed by every BGP speaker, but these discretionary attributes may be omitted in an UPDATE message. A BGP speaker that receives either type of well-known attribute and subsequently modifies the attribute must then propagate it to its peers in the UPDATE messages.

A BGP speaker is not required to support the optional transitive and optional non-transitive attributes. These types of attributes may be omitted in the UPDATE messages. A BGP speaker should accept NLRI with unrecognized optional transitive attributes, in which case the unrecognized optional transitive attributes are redistributed along with the received NLRI to the peers. A BGP speaker must silently ignore unrecognized optional non-transitive attributes and these attributes are not redistributed to the peers.
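The handling rules for the four categories can be summarized as a small decision function. This is only a sketch: the Action names and the handle_attribute function are hypothetical, and the error branch assumes that an unrecognized well-known attribute is reported with the Unrecognized Well-known Attribute subcode from Table 1-3.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process normally"
    FORWARD_UNCHANGED = "accept the NLRI and pass the attribute on to peers"
    IGNORE = "silently ignore and do not pass on"
    ERROR = "report an UPDATE Message Error to the peer"


def handle_attribute(optional, transitive, recognized):
    """Decide what to do with one received path attribute."""
    if recognized:
        return Action.PROCESS
    if not optional:
        # Well-known attributes must be recognized by every BGP speaker.
        return Action.ERROR
    if transitive:
        # Unrecognized optional transitive: redistribute with the NLRI.
        return Action.FORWARD_UNCHANGED
    # Unrecognized optional non-transitive: drop quietly.
    return Action.IGNORE
```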

The Path Attributes are encoded in the <type, length, value> format. The attribute type is a two-byte field that is divided into Attribute Flags and Attribute Type Code as shown in Figure 1-22.

FIGURE 1-22

The O bit is the Optional bit, and if it is set to 1, the attribute is an optional attribute; otherwise the attribute is well-known. The T bit is the Transitive bit, and if it is set to 1 on an optional attribute, the attribute is an optional transitive attribute. Well-known attributes always have the T bit set to 1. The P bit is the Partial bit. When a BGP speaker receives an unrecognized optional transitive attribute and the BGP speaker decides to accept the associated NLRI, the BGP speaker sets the P bit before redistributing the unknown attribute to its peers. The P bit must not be reset to 0 once it is set to 1. The E bit is the Extended Length bit, and if it is set to 1, the third and fourth bytes are used for the attribute length field; otherwise only the third byte belongs to the length field. The remaining four bits are unused. The Type Code field contains the attribute type, and each type is described in the following:

ORIGIN The ORIGIN attribute specifies the source of the prefix, which can be one of three values. IGP indicates the prefix was obtained from an interior gateway protocol (IGP). EGP indicates the prefix was obtained from an exterior gateway protocol (EGP). INCOMPLETE indicates the prefix was obtained from neither an IGP nor an EGP but by other means, for example, through manual route injection by the system administrator. The ORIGIN attribute is a well-known mandatory attribute.

AS_PATH The AS_PATH attribute contains the list of AS path segments. Each path segment lists the ASes that the route has traversed. The BGP speaker inserts its ASN in the AS_PATH when it redistributes a route to its external peers. The AS_PATH attribute is a well-known mandatory attribute that is used by each BGP speaker to detect routing loops. The AS_PATH attribute also allows the route selection algorithm to choose the route with the shortest path when multiple routes have the same degree of preference.

NEXT_HOP The NEXT_HOP attribute specifies the next hop router for reaching the prefixes that are provided in NLRI. This is a mandatory well-known attribute. This attribute is not used for IPv6.

MULTI_EXIT_DISC The MULTI_EXIT_DISC attribute is called the Multi-Exit Discriminator and is used by a BGP speaker to set the metric values on multiple paths that enter into the local AS, thereby informing an EBGP peer about the optimal entry point for inbound traffic. The smaller the metric value is, the more preferred the path is. Since the MULTI_EXIT_DISC attribute is an optional non-transitive attribute, a BGP speaker that receives it from a neighboring AS must not propagate it to other neighboring ASes, although it may pass the attribute to its IBGP peers.

LOCAL_PREF The LOCAL_PREF attribute is called the Local Preference and is used by IBGP peers to convey path preference within an AS, thereby informing IBGP peers about the optimal exit point for outbound traffic. The administrator sets this attribute to specify the optimal AS exit point. The higher the value is, the more preferred the path is. The LOCAL_PREF attribute is a well-known attribute. A BGP speaker must not include the LOCAL_PREF attribute in UPDATE messages that are sent to EBGP peers.

LOCAL_PREF was previously categorized as discretionary, but [RFC4271] has removed the discretionary categorization from this attribute. The reason is not clear, but we speculate that the change was made because the requirement level of this attribute for IBGP is different from that for EBGP.

FIGURE 1-10

ATOMIC_AGGREGATE A BGP speaker may aggregate multiple prefixes into a single prefix and advertise that prefix to its peers. In this case the BGP speaker that performed the aggregation includes the ATOMIC_AGGREGATE attribute to indicate to its peers that a less specific prefix is being advertised. The ATOMIC_AGGREGATE attribute is a well-known discretionary attribute.

AGGREGATOR The AGGREGATOR attribute contains the AS number and the IPv4 address of the BGP speaker that performed the route aggregation. The address is normally the same as the BGP identifier of that speaker. This attribute is an optional transitive attribute.

COMMUNITY The usage of the COMMUNITY attribute is outside the scope of this topic. Therefore its description is omitted here.

The MP_REACH_NLRI and the MP_UNREACH_NLRI attributes are described in the next section.
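The <type, length, value> encoding and the flag bits can be made concrete with a short parser. This is a sketch only: the function name and the returned dictionary are hypothetical, and the bit masks and type code numbers are assumed from RFC 4271, RFC 1997, and RFC 2858 because they are not listed in this text.

```python
import struct

# Flag bits occupy the high-order four bits of the first byte (assumed from RFC 4271).
FLAG_OPTIONAL = 0x80    # O bit
FLAG_TRANSITIVE = 0x40  # T bit
FLAG_PARTIAL = 0x20     # P bit
FLAG_EXT_LEN = 0x10     # E bit

# Attribute type codes (assumed from the RFCs, not given in this section).
TYPE_NAMES = {
    1: "ORIGIN", 2: "AS_PATH", 3: "NEXT_HOP", 4: "MULTI_EXIT_DISC",
    5: "LOCAL_PREF", 6: "ATOMIC_AGGREGATE", 7: "AGGREGATOR",
    8: "COMMUNITY", 14: "MP_REACH_NLRI", 15: "MP_UNREACH_NLRI",
}


def parse_attribute(buf, offset=0):
    """Parse one path attribute; return (info dict, offset of the next one)."""
    flags, type_code = struct.unpack_from("!BB", buf, offset)
    if flags & FLAG_EXT_LEN:
        # Extended Length: the third and fourth bytes hold the length.
        (length,) = struct.unpack_from("!H", buf, offset + 2)
        value_start = offset + 4
    else:
        # Otherwise only the third byte holds the length.
        length = buf[offset + 2]
        value_start = offset + 3
    value = buf[value_start:value_start + length]
    if len(value) != length:
        raise ValueError("attribute value is truncated")
    info = {
        "name": TYPE_NAMES.get(type_code, f"type {type_code}"),
        "optional": bool(flags & FLAG_OPTIONAL),
        "transitive": bool(flags & FLAG_TRANSITIVE),
        "partial": bool(flags & FLAG_PARTIAL),
        "value": value,
    }
    return info, value_start + length
```

Calling parse_attribute repeatedly with the returned offset walks through the whole Path Attributes field of an UPDATE message.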

IPv6 Extensions for BGP4+

A BGP4+ speaker that understands IPv6 must indicate it supports the multiprotocol extensions for BGP4+ by setting the necessary capability in the OPEN message. The fields of the capability parameter are shown in Figure 1-23.

The capability code is set to 1 for multiprotocol extensions. The Address Family Identifier (AFI) field is set to 2, which is the address family number assigned by IANA for IPv6. The Subsequent Address Family Identifier (SAFI) provides additional information about the NLRI carried in the multiprotocol NLRI attributes. [RFC2858] defines the following values:

1 NLRI used for unicast forwarding. This value is used for IPv6 unicast routing.

2 NLRI used for multicast reverse path forwarding calculation.

3 NLRI used for both unicast and multicast.
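As an illustration, the multiprotocol capability for IPv6 unicast could be encoded as follows. The function names are hypothetical. The 4-byte value layout (a 2-byte AFI, a 1-byte reserved field, and a 1-byte SAFI) is assumed from [RFC2858], since Figure 1-23 is not reproduced here, and the wrapping of the capability inside an OPEN optional parameter is left out.

```python
import struct

CAP_MULTIPROTOCOL = 1   # capability code for multiprotocol extensions
AFI_IPV6 = 2            # IANA address family number for IPv6
SAFI_UNICAST = 1        # NLRI used for unicast forwarding


def build_mp_capability(afi=AFI_IPV6, safi=SAFI_UNICAST):
    """Return the capability as code, length, then AFI / reserved / SAFI."""
    value = struct.pack("!HBB", afi, 0, safi)   # reserved byte is sent as 0
    return struct.pack("!BB", CAP_MULTIPROTOCOL, len(value)) + value


def parse_mp_capability(cap):
    """Return (afi, safi) from an encoded multiprotocol capability."""
    code, length = struct.unpack_from("!BB", cap, 0)
    if code != CAP_MULTIPROTOCOL or length != 4 or len(cap) != 6:
        raise ValueError("not a multiprotocol extensions capability")
    afi, _reserved, safi = struct.unpack_from("!HBB", cap, 2)
    return afi, safi
```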

An UPDATE message that carries only IPv6 routes sets the Withdrawn Routes Length field to 0 and does not include the NLRI field. Since the NEXT_HOP attribute is an IPv4-specific attribute, it can be omitted in UPDATE messages that carry only IPv6 NLRI.

[RFC2858] indicates that the NEXT_HOP attribute may be omitted, but in practice this attribute is normally set to 0.0.0.0 and included in UPDATE messages that carry IPv6 NLRI. The main reason is that a specific BGP implementation was observed to reject BGP messages that lacked this well-known mandatory attribute.

Advertising IPv6 Routes

BGP4+ uses the MP_REACH_NLRI path attribute to advertise IPv6 routes. The MP_REACH_NLRI attribute is an optional non-transitive attribute. Figure 1-24 shows the format of this attribute.

Address Family Identifier The two-byte AFI field is set to value 2 for IPv6.

Subsequent AFI The 1-byte SAFI field is set according to the value defined in [RFC2858].

Length of Next Hop Address This 1-byte field specifies the size of the next hop address. Typically the next hop address carries only the global IPv6 address of the next hop router. In this case the length field is set to 16. The link-local address may be included as the additional next hop address if the advertising BGP speaker shares a common link with the next hop and the peer to which the route is advertised. In this case the length field is set to 32.

Next Hop Address This field contains the global IPv6 address of the next hop router. Depending on the value of the next hop address length field, the link-local address of the router may be included in addition to the global IPv6 address.

FIGURE 1-24

Number of SNPAs This 1-byte field specifies the number of Subnetwork Points of Attachment (SNPA) that are present in the attribute. This field is set to 0 for IPv6, which means the SNPA field is omitted.

NLRI The NLRI lists the routes that are advertised by this attribute. For IPv6 the NLRI is encoded in the <length, prefix> format.

Length This 1-byte field specifies the size of the prefix in bits that immediately follows this field. 0 is a special value that indicates the prefix matches all IP addresses, that is, the prefix has all zero values in every byte.

Prefix This field contains the reachable prefix. The Prefix field is variable in size. This field may be padded such that the prefix is aligned on the byte boundary.
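Putting the fields above together, the value part of an MP_REACH_NLRI attribute for IPv6 unicast could be assembled as in the following sketch. The function names are hypothetical, the field order follows the descriptions above, and the attribute flags, type code, and attribute length that precede the value are not included.

```python
import ipaddress
import struct

AFI_IPV6 = 2
SAFI_UNICAST = 1


def encode_prefix6(cidr):
    """Encode an IPv6 prefix such as '2001:db8:1::/48' as <length, prefix>."""
    net = ipaddress.IPv6Network(cidr)
    nbytes = (net.prefixlen + 7) // 8           # round up to whole bytes
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def build_mp_reach_value(global_nh, prefixes, link_local_nh=None):
    """Return the value part of an MP_REACH_NLRI attribute."""
    next_hop = ipaddress.IPv6Address(global_nh).packed
    if link_local_nh is not None:
        # Optional second next hop; the length field then becomes 32.
        next_hop += ipaddress.IPv6Address(link_local_nh).packed
    value = struct.pack("!HBB", AFI_IPV6, SAFI_UNICAST, len(next_hop))
    value += next_hop
    value += b"\x00"                            # Number of SNPAs: 0 for IPv6
    value += b"".join(encode_prefix6(p) for p in prefixes)
    return value
```

With only a global next hop the Length of Next Hop Address byte is 16. Passing a link-local address as the third argument raises it to 32.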

Withdrawing IPv6 Routes

BGP4+ uses the MP_UNREACH_NLRI attribute to withdraw IPv6 routes. The MP_UNREACH_NLRI attribute is an optional non-transitive attribute. Figure 1-25 shows the format of this attribute.

Address Family Identifier The two-byte AFI field is set to value 2 for IPv6.

Subsequent AFI The 1-byte SAFI field is set according to the value defined in [RFC2858].

Withdrawn Routes This field contains the list of prefixes to be removed from the routing table. For IPv6 the withdrawn routes are encoded in the <length, prefix> format.

Length This 1-byte field specifies the size of the prefix in bits that immediately follows this field. 0 is a special value indicating that the prefix matches all IP addresses, that is, the prefix has all zero values in every byte.

Prefix This field contains the prefix to be withdrawn. The Prefix field is variable in size. This field may be padded such that the prefix is aligned on the byte boundary.
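The receiving side can be sketched in the same way. The following hypothetical function decodes the value part of an MP_UNREACH_NLRI attribute into a list of withdrawn IPv6 prefixes. It assumes that the attribute flags, type code, and attribute length have already been stripped.

```python
import ipaddress
import struct


def parse_mp_unreach_value(value):
    """Return (afi, safi, [withdrawn IPv6 prefixes]) from the attribute value."""
    afi, safi = struct.unpack_from("!HB", value, 0)
    if afi != 2:
        raise ValueError("this sketch only handles AFI 2 (IPv6)")
    prefixes, i = [], 3
    while i < len(value):
        bits = value[i]                          # Length, in bits
        nbytes = (bits + 7) // 8                 # Prefix size, in bytes
        raw = value[i + 1:i + 1 + nbytes]
        if bits > 128 or len(raw) != nbytes:
            raise ValueError("malformed withdrawn route entry")
        # Pad the truncated prefix back out to a full 16-byte address.
        addr = ipaddress.IPv6Address(raw + bytes(16 - nbytes))
        prefixes.append(f"{addr}/{bits}")
        i += 1 + nbytes
    return afi, safi, prefixes
```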

BGP4+ Route Selection Process

BGP path selection takes place when a BGP router receives an UPDATE message from its peer. The BGP4+ route selection process needs to take into account the path segment that is internal within the AS, and the path segment that is external to the AS.

FIGURE 1-25

Typically the policies that apply to route selections are different for the two path segments, which are reflected in the settings of the various attributes such as the LOCAL_PREF and the MULTI_EXIT_DISC attributes.

The BGP path selection algorithm is called the best path selection algorithm because route selection is based on degree of preference. The path selection algorithm is composed of two phases. In the first phase the preference of each route is determined. In the second phase, all feasible routes are considered and the route with the highest preference is chosen as the best route. Tie-breaker rules are executed to select a single entry when multiple routes have the same preference. A route is considered feasible if the NEXT_HOP attribute is resolvable and the AS_PATH attribute does not contain the receiver’s ASN.
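The feasibility test can be written as a one-line predicate. This sketch makes two simplifying assumptions: the AS_PATH is flattened into a plain list of ASNs, and next hop resolution is performed elsewhere and passed in as a boolean. The function name is hypothetical.

```python
def is_feasible(as_path, local_asn, next_hop_resolvable):
    """Apply the two feasibility conditions to one received route."""
    # A path that already contains our own ASN would form a routing loop.
    return next_hop_resolvable and local_asn not in as_path


def feasible_routes(routes, local_asn):
    """Keep only feasible routes; each route is an (as_path, resolvable) pair."""
    return [r for r in routes if is_feasible(r[0], local_asn, r[1])]
```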

Computing Route Preference

The preference of a route is determined by the LOCAL_PREF attribute if the UPDATE message is received from an internal peer. It is also permissible to calculate the preference of a route from locally configured policy even when an IBGP peer originated the UPDATE message. In this case, however, the calculated preference may cause the route to be selected as the best route, which may subsequently cause a routing loop.

The locally configured policy is used to calculate route preference when the UPDATE message is received from an EBGP peer. The resulting preference may be redistributed to IBGP peers in the LOCAL_PREF attribute if the received route is deemed eligible.
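The first phase can be sketched as follows. Routes are modeled as plain dictionaries, and the key names (from_ibgp, local_pref, neighbor_as) as well as both function names are hypothetical. The local policy is simply a callable that maps a route to a number.

```python
def degree_of_preference(route, local_policy):
    """Return the preference used in the first phase of route selection."""
    if route["from_ibgp"]:
        # Internal peers already carry the preference in LOCAL_PREF.
        return route["local_pref"]
    # Routes from external peers are scored by locally configured policy.
    return local_policy(route)


def prepare_for_ibgp(route, local_policy):
    """Copy an EBGP-learned route and attach its preference as LOCAL_PREF."""
    out = dict(route)
    out["local_pref"] = degree_of_preference(route, local_policy)
    return out
```

A policy could be as simple as a function that returns 200 for routes learned from one preferred neighbor AS and 100 for all others.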

Route Selection

The BGP4+ route selection algorithm chooses the route with the highest degree of preference among all possible paths to the same prefix.

A route is chosen as the best path if that route is the only route to a given prefix. For routes received from internal peers, the LOCAL_PREF is used instead of the locally configured policy when computing the degree of preference, so routes with the highest LOCAL_PREF value are preferred.

When multiple routes to the same prefix have the same degree of preference, the following rules serve as the tie-breakers to select a single route:

• The route with the shortest AS_PATH is preferred.

• The route with the lowest ORIGIN code is preferred. In other words, routes that are originated from IGP are preferred over routes that are originated from EGP, which in turn are preferred over routes whose ORIGIN is INCOMPLETE.

• The route with the lowest value MULTI_EXIT_DISC is preferred. The comparison of the MULTI_EXIT_DISC applies to routes that are learned from the same AS. In this context, the route without the MULTI_EXIT_DISC attribute is preferred over the one with the MULTI_EXIT_DISC attribute attached.

• The route advertised by an EBGP peer is preferred over the same route that is advertised by an IBGP peer.

• The route with the smallest interior cost (or metric) to the next hop router, which is specified by the Next Hop Address field of the MP_REACH_NLRI attribute for IPv6, is preferred.

• The route that was advertised by the BGP router having the lowest BGP identifier is preferred.

• The route that was received from the peer with the lowest address is preferred.
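The selection and tie-breaking rules above can be sketched as successive filters over the candidate routes. The Route class and its field names are hypothetical, and the routes passed in are assumed to be feasible already. A missing MULTI_EXIT_DISC is modeled as None and treated as better than any present value, following the rule stated in the list.

```python
from dataclasses import dataclass
from typing import Optional
import ipaddress


@dataclass
class Route:
    preference: int          # degree of preference from the first phase
    as_path_len: int
    origin: int              # 0 = IGP, 1 = EGP, 2 = INCOMPLETE
    med: Optional[int]       # None when MULTI_EXIT_DISC is absent
    neighbor_as: int
    from_ebgp: bool
    igp_cost: int            # interior cost to the next hop
    router_id: str           # BGP identifier of the advertising router
    peer_address: str


def _keep_lowest(routes, key):
    lowest = min(key(r) for r in routes)
    return [r for r in routes if key(r) == lowest]


def _med(route):
    return -1 if route.med is None else route.med


def select_best(routes):
    """Return the single best route among feasible routes to one prefix."""
    c = _keep_lowest(routes, lambda r: -r.preference)   # highest preference
    c = _keep_lowest(c, lambda r: r.as_path_len)
    c = _keep_lowest(c, lambda r: r.origin)
    # MED is compared only between routes learned from the same AS.
    c = [r for r in c
         if not any(o.neighbor_as == r.neighbor_as and _med(o) < _med(r)
                    for o in c)]
    if any(r.from_ebgp for r in c):
        c = [r for r in c if r.from_ebgp]
    c = _keep_lowest(c, lambda r: r.igp_cost)
    c = _keep_lowest(c, lambda r: int(ipaddress.IPv4Address(r.router_id)))
    c = _keep_lowest(c, lambda r: int(ipaddress.ip_address(r.peer_address)))
    return c[0]
```

Filtering step by step, rather than sorting on one combined key, keeps the MULTI_EXIT_DISC rule correct, because that comparison applies only within groups of routes learned from the same AS.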
