IEEE 1588 Precision Time Protocol

The IEEE 1588 PTP is designed to synchronize real-time clocks in LANs used for telecommunications, industrial automation, and test and measurement applications. It is most commonly used in Ethernet LANs supporting multicast communications but can be used in other network technologies as well. Typical accuracies achieved on a high-speed, multiple-segment LAN are within 100 ns and in some cases much less. Version 1 of the PTP was published in 2002 [1], but it is not described in this topic. Version 2, published in 2008 [2], is the topic of this section. A number of related publications, including Garner [3] and Subrahmanyan [4], as well as a book by John C. Eidson [5], have been published about the protocol and its applications. This section contains an overview of PTP and a comparison with NTP.

Timestamp Capture

A 1588 clock is an oscillator, usually a temperature-compensated crystal oscillator (TCXO), and a counter that represents time in seconds and nanoseconds since 0 h 1 January 1970. The intended timescale is International Atomic Time (TAI) with provisions for the UTC offset and advance notice of leap seconds. The time representation is similar to POSIX, except the PTP seconds field has 48 bits, making the timestamp 10 octets long. A PTP timestamp is latched from the counter when an Ethernet start-of-frame (SOF) octet is detected on the wire. This requires an NIC that can provide these timestamps to upper protocol layers.
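As a concrete illustration of this representation, the following sketch (in Python) packs and unpacks the 10-octet PTP timestamp, 48 bits of seconds followed by 32 bits of nanoseconds; the function names are illustrative and not part of any standard API.

    import struct

    def pack_ptp_timestamp(seconds, nanoseconds):
        """Pack a PTP timestamp: 48-bit seconds + 32-bit nanoseconds = 10 octets."""
        if not (0 <= seconds < 1 << 48 and 0 <= nanoseconds < 10**9):
            raise ValueError("timestamp out of range")
        # Big-endian: 16 high bits of seconds, 32 low bits of seconds, 32 bits of nanoseconds
        return struct.pack(">HII", seconds >> 32, seconds & 0xFFFFFFFF, nanoseconds)

    def unpack_ptp_timestamp(buf):
        """Inverse of pack_ptp_timestamp."""
        hi, lo, ns = struct.unpack(">HII", buf[:10])
        return (hi << 32) | lo, ns

    # Round-trip check with an arbitrary value
    assert unpack_ptp_timestamp(pack_ptp_timestamp(1199145600, 500)) == (1199145600, 500)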

Figure 15.5 shows the block diagram for a typical 1588 NIC. It includes a media access control (MAC) layer, which contains the FIFO and frame buffers together with an embedded state machine that assembles a frame, including the Ethernet header, user data, and FCS. The frame is then sent in four-bit nibbles over the media-independent interface (MII) bus to the physical (PHY) layer, where it is coded for transmission over the wire. To support a timestamping function, the MII nibbles are passed through an FPGA, where the necessary 1588 operations are performed without affecting other MAC or PHY operations. The purpose of the FPGA is to latch the 1588 clock when an SOF octet is detected.


For 10-Mb/s Ethernets, the PHY inserts the Ethernet preamble, then the rest of the frame in Manchester encoding. For 100-Mb/s Ethernets, the PHY first encodes each MII nibble as a five-bit symbol, where each symbol represents either one of the 16 nibble bit combinations or one of several additional special symbols for the idle sequence and frame delimiting. The resulting symbol stream is first processed by a scrambler, then encoded as a multilevel transmit (MLT-3) stream. The reason for these steps is to reduce RFI by spreading and lowering the signal spectrum, but the details are otherwise not important here.

At the present state of the art, it is possible to implement the entire PTP stack onboard the NIC, for example using the Intel IXP465 network processor [6], a chip that contains an embedded RISC microprocessor and encryption engine. The PTP specification includes an addendum that defines the Ethernet-specific encapsulation assigned to PTP. Used in this way, PTP operations can be completely transparent to other protocols using the same NIC.

In many cases, some PTP functions are offloaded from the NIC to an associated driver or application program, and provisions are made to discipline the 1588 clock and retrieve timestamps. This involves encapsulating the PTP header in some other protocol, typically Internet Protocol/User Datagram Protocol (IP/UDP).
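As a minimal illustration of this encapsulation, the sketch below sends an already-encoded PTP general message over IP/UDP using the well-known PTP port numbers (319 for event messages, 320 for general messages) and the PTP IPv4 multicast group 224.0.1.129; building the message bytes is assumed to happen elsewhere.

    import socket

    PTP_EVENT_PORT = 319      # timestamped event messages (Sync, Delay_Req, ...)
    PTP_GENERAL_PORT = 320    # general messages (Follow_Up, Delay_Resp, Announce, ...)
    PTP_MCAST_GROUP = "224.0.1.129"

    def send_general_message(msg):
        """Send an already-encoded PTP general message to the PTP multicast group."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(msg, (PTP_MCAST_GROUP, PTP_GENERAL_PORT))
        sock.close()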

Figure 15.5 1588 NIC architecture.

The problem remains how to retrieve the transmit and receive timestamps for use by the higher-level protocol. PTP event messages contain a single timestamp field that can be overwritten by the FPGA with a PTP timestamp. If this is done, provisions must be made either to ignore the FCS and UDP checksums or to recalculate them. The PTP specification includes an addendum that defines the UDP-specific encapsulation assigned to PTP.
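The checksum recalculation can be done incrementally rather than over the entire datagram; for IPv4 the UDP checksum may instead simply be set to zero. The following sketch applies the standard one's-complement incremental update (in the style of RFC 1624) to fold the overwritten 10-octet timestamp field into an existing checksum; the function names are illustrative.

    def incr_checksum(cksum, old_word, new_word):
        """One 16-bit word of a one's-complement incremental checksum update
        (RFC 1624): HC' = ~(~HC + ~m + m')."""
        s = (~cksum & 0xFFFF) + (~old_word & 0xFFFF) + (new_word & 0xFFFF)
        while s >> 16:
            s = (s & 0xFFFF) + (s >> 16)
        return ~s & 0xFFFF

    def update_checksum_for_field(cksum, old_field, new_field):
        """Fold the replacement of an aligned, even-length field (such as the
        10-octet timestamp) into an existing UDP checksum."""
        assert len(old_field) == len(new_field) and len(old_field) % 2 == 0
        for i in range(0, len(old_field), 2):
            old_w = int.from_bytes(old_field[i:i+2], "big")
            new_w = int.from_bytes(new_field[i:i+2], "big")
            cksum = incr_checksum(cksum, old_w, new_w)
        return cksum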

PTP Clock Architecture

Figure 15.6 shows the block diagram of a PTP ordinary clock (OC) containing a 1588 clock, a discipline loop, and a protocol state machine. It has a universally unique clock identifier (UUID) assigned much the same way as Ethernet MAC addresses. In addition, the OC has one or more physical ports that connect to different LAN segments. Each physical port has two logical ports, one used for timestamped event messages and the other for general messages. The concatenation of UUID and port number is called the port identifier (portID).

A PTP subnet consists of a number of OCs operating on one or more LAN segments interconnected by bridges. More than one subnet can be deployed on the same network, each distinguished by a unique domain number. One or more OCs acting as grandmaster (GM) clocks provide timing to other OCs operating as masters or slaves. One or more OCs operating as masters provide timing to other OCs operating as masters or slaves.

A PTP bridge operates in either of two modes for PTP messages. For all other messages, it operates as an ordinary bridge. A PTP bridge operating as a transparent clock (TC) calculates a correction due to the ingress PTP message transmission time and the residence time in the bridge itself. It then adds this value to the correction field in the egress message header. This is a tricky maneuver, as it requires correcting or recalculating whatever checksums are involved.
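As an illustration of the correction step, the sketch below adds a measured residence (plus ingress transmission) time to the correction field of a message held in a byte buffer, assuming the IEEE 1588-2008 encoding of that field as a signed 64-bit value in nanoseconds scaled by 2^16 at offset 8 of the common header; the checksum fix-up just mentioned is not repeated here.

    import struct

    CORRECTION_OFFSET = 8   # octet offset of the correction field in the common header

    def add_residence_time(msg, residence_ns):
        """Add a residence time, in nanoseconds, to the correction field of a
        PTP message held in a bytearray, as a transparent clock would."""
        (corr,) = struct.unpack_from(">q", msg, CORRECTION_OFFSET)
        corr += round(residence_ns * (1 << 16))   # the field is nanoseconds scaled by 2**16
        struct.pack_into(">q", msg, CORRECTION_OFFSET, corr)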

Figure 15.6 PTP ordinary clock.

Figure 15.7 Typical PTP subnet.

The LAN segments joined by a TC are considered the same logical LAN segment. A bridge operating as a boundary clock (BC) operates as a slave for an upstream master on one segment and as a master for downstream slaves on all other segments. In particular, it does not repeat PTP broadcast messages from one segment to another. In a very real sense, a BC is like an NTP secondary server.

Figure 15.7 shows a typical PTP subnet including a GM, four TCs, seven OCs, and a BC. The best master clock (BMC) algorithm discussed in Section 15.4.4 constructs a spanning tree designated by the arrows, which show the flow of synchronization from the GM master port M to the slave ports S and from the BC master port M to the slave ports S. The number following the GM, BC, and OC designator is the stratum, called the steps removed in the specification, constructed by the BMC algorithm.

PTP Messages

It will be helpful in the following discussion to briefly describe the message types and formats used in PTP. There are two message classes: event messages and general messages. Event messages and most general messages include a single timestamp field beyond the header. Event messages use this field for the receive timestamp provided by the 1588 NIC; general messages use this field to return requested timestamps.

Table 15.1 shows the PTP message types. Those with type codes less than 8 are event messages; those with other codes are general messages. Only the Sync, Follow_Up, Delay_Req, Delay_Resp, and Announce messages are used in the basic protocol described here. The Sync, Follow_Up, and Announce messages are sent using broadcast means by a master to all slaves on the LAN segment. The other messages are sent using unicast means. The remaining message types are used for special cases and management functions beyond the scope of this discussion.

All PTP messages include a common header shown in Table 15.2. All the messages discussed here have only a single 10-octet timestamp, except the Announce message, which has the payload shown in Table 15.3. Not all the fields are relevant to this discussion.

Table 15.1 PTP Message Types

Type  Name                        Use
0     Sync                        Master broadcasts timestamp T2
1     Delay_Req                   Slave requests timestamp T3
2     Path_Delay_Req              Not applicable
3     Path_Delay_Resp             Not applicable
8     Follow_Up                   Master broadcasts timestamp T1
9     Delay_Resp                  Slave receives timestamp T4
A     Path_Delay_Resp_Follow_Up   Not applicable
B     Announce                    Topology management
C     Signalling                  Utility
D     Management                  Network management
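A minimal sketch of these type codes as constants, with the event/general distinction drawn at type code 8 as described above; the enumeration itself is illustrative rather than taken from any particular implementation.

    from enum import IntEnum

    class MessageType(IntEnum):
        # Event messages (type codes below 8 are timestamped by the NIC)
        SYNC = 0x0
        DELAY_REQ = 0x1
        PATH_DELAY_REQ = 0x2
        PATH_DELAY_RESP = 0x3
        # General messages
        FOLLOW_UP = 0x8
        DELAY_RESP = 0x9
        PATH_DELAY_RESP_FOLLOW_UP = 0xA
        ANNOUNCE = 0xB
        SIGNALLING = 0xC
        MANAGEMENT = 0xD

    def is_event_message(msg_type):
        """Event messages carry hardware timestamps; their type codes are below 8."""
        return msg_type < 8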

Table 15.2 Message Header Format

Length  Offset  Name             Use
1       0       Type             Message type
1       1       Version          PTP version (2)
2       2       Length           Message length
1       4       Domain           PTP domain
1       5       Reserved         Not used
2       6       Flags            Protocol flag bits
8       8       Correction       Timestamp correction
4       16      Reserved         Not used
10      20      SourcePortID     Port ID of sender
2       30      SequenceID       Message sequence number
1       32      Control          Used for version compatibility
1       33      MessageInterval  log2 of message interval
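The following sketch decodes the 34-octet common header of Table 15.2 into a named tuple, assuming big-endian byte order and the field widths and offsets shown in the table; it is illustrative only.

    import struct
    from collections import namedtuple

    PTPHeader = namedtuple(
        "PTPHeader",
        "msg_type version length domain flags correction "
        "source_port_id sequence_id control message_interval",
    )

    # Layout per Table 15.2 (big-endian), 34 octets in all; 'x' marks reserved octets
    _HEADER_FMT = ">BBHBxHq4x10sHBb"

    def parse_header(buf):
        fields = struct.unpack_from(_HEADER_FMT, buf)
        return PTPHeader(*fields)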

Best Master Clock Algorithm

An important feature of PTP is its ability to generate an acyclic spanning tree that determines the timing flow from a GM via BCs to the slaves. This is done by the BMC algorithm using a transitive ">" relation determined from the elements of the data sets maintained for each clock and each port of the clock. The > relation defines a partial ordering of the ports, which in turn determines the spanning tree and establishes the state of each port. The composition of the > relation is discussed in the next section.

Assume a particular clock C0 has a data set D0 and has N ports. The object of the algorithm is to assign to each port one of three states: MASTER, SLAVE, or PASSIVE. A port in MASTER state broadcasts periodic Announce messages, including data set D0, to all other ports sharing the same LAN segment. Initially, all ports are in MASTER state. As the result of the BMC algorithm, all but one of them will become slaves.

Table 15.3 Announce Message Format

Length  Offset  Name               Use
34      0       Header             Message header
10      34      UTCOffset          Current UTC offset from Posix
2       44      TimescaleInUse     TAI or arbitrary
4       46      AnnounceFlags      Protocol flag bits
2       50      StepsRemoved       Stratum
10      52      GMPortID           GM UUID [7], port number [5]
4       62      GMClockQuality     Class [9], source [9], variance [5]
1       66      GMPriority1        Arbitrary
1       67      GMPriority2        Arbitrary
10      68      ParentPortID       Master port ID
4       78      LocalClockQuality  Class [9], source [9], variance [5]
4       82      ClockChangeRate    Master change rate
2       86      ParentVariance     Master variance

Announce messages arriving at a port in any state are collected, and the latest one from each different clock is saved on a list. At periodic intervals, clock C0 saves the best data set Di from the ith port, then selects the best data set DB from among the Di data sets. Here, "best" is determined by pairwise data set comparisons using the > relation.

The BMC algorithm then uses the data sets D0, Di, and DB to determine the state of each port. This depends on the class of C0, which determines whether the clock is a GM or not. If C0 is a GM and D0 > Di, the state of port i is MASTER; otherwise, it is PASSIVE. If C0 is not a GM, the following rules apply to all ports of all clocks (a sketch of the decision follows the list):

1. If D0 > Di, the state of port i is MASTER; otherwise,

2. If Di = DB, the state of port i is SLAVE; otherwise,

3. The state of port i is PASSIVE.
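A minimal sketch of this port-state decision, assuming a predicate better(a, b) that implements the > relation developed in the next section; the data sets are treated as opaque objects and the names are illustrative.

    def bmc_port_state(d0, d_i, d_b, is_gm, better):
        """Decide the state of port i from the local data set d0, the best data
        set d_i received on port i, and the best data set d_b over all ports.
        better(a, b) implements the > relation of the data set comparison."""
        if is_gm:
            return "MASTER" if better(d0, d_i) else "PASSIVE"
        if better(d0, d_i):
            return "MASTER"
        if d_i == d_b:
            return "SLAVE"
        return "PASSIVE"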

Data Set Comparison Algorithm

The composition of the > relation is developed from the specification, which includes an intricate and sometimes confusing set of flow diagrams. Consider two data sets to be compared. At each step, the data set with the preferred value (highest or lowest, as indicated) of the designated member is selected as A and the other as B; then A > B. If the values are identical, the comparison falls through to the next step. The data set members are determined from the fields of the most recent Announce message received (see Table 15.3). A sketch of the comparison follows the list.

1. Select the highest GMpriority1.

2. Select the lowest class in GMClockQuality.

3. If A and B have the same GM UUID, go to step 8.

4. Select the best source in GMClockQuality (GPS, NTP, etc.).

5. If the variances differ significantly, select the lowest variance in GMClockQuality.

6. Select the highest GMpriority2.

7. Select the lowest GMPortID, which is guaranteed to be unique. (Steps 8 and 9 apply when the data sets have the same GM UUID, as determined in step 3.)

8. If the strata are not equal, select the lowest GMStepsRemoved (stratum). If the GM stratum of A is the same as that of B, there is a potential loop, which step 9 resolves.

9. If the receiving port UUID of B is less than the sending port UUID of B, select A and set the state of B to PASSIVE. If the receiving port UUID of A is less than the sending port UUID of A, select B and set the state of A to PASSIVE.
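A condensed sketch of this comparison, following the steps exactly as listed (including the "highest priority wins" wording above); the data sets are assumed to be simple records populated from the Announce fields of Table 15.3, lower numbers are assumed to denote better sources, the "differ significantly" threshold is left as a tunable constant, and the PASSIVE side effect of step 9 is omitted.

    from dataclasses import dataclass

    VARIANCE_THRESHOLD = 4   # what "differ significantly" means is an assumption here

    @dataclass
    class AnnounceDataSet:
        gm_priority1: int
        gm_class: int
        gm_source: int             # GM time source quality (GPS, NTP, ...); lower = better (assumed)
        gm_variance: int
        gm_uuid: bytes
        gm_priority2: int
        gm_port_id: bytes
        steps_removed: int
        sender_port_uuid: bytes    # port that sent the Announce message
        receiver_port_uuid: bytes  # local port that received it

    def better(a, b):
        """True if a > b under the data set comparison; equal members fall through."""
        if a.gm_priority1 != b.gm_priority1:                  # step 1
            return a.gm_priority1 > b.gm_priority1
        if a.gm_class != b.gm_class:                          # step 2
            return a.gm_class < b.gm_class
        if a.gm_uuid != b.gm_uuid:                            # step 3: different GMs
            if a.gm_source != b.gm_source:                    # step 4
                return a.gm_source < b.gm_source
            if abs(a.gm_variance - b.gm_variance) > VARIANCE_THRESHOLD:  # step 5
                return a.gm_variance < b.gm_variance
            if a.gm_priority2 != b.gm_priority2:              # step 6
                return a.gm_priority2 > b.gm_priority2
            return a.gm_port_id < b.gm_port_id                # step 7: unique tiebreaker
        if a.steps_removed != b.steps_removed:                # step 8: same GM
            return a.steps_removed < b.steps_removed
        return b.receiver_port_uuid < b.sender_port_uuid      # step 9 (simplified)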

PTP Time Transfer

The key to understanding how PTP works is this observation: For event messages (only), the timestamp field in the message is overwritten by the receive timestamp on arrival, and the transmit timestamp is available only in the timestamp field of the next message sent. Note that for a broadcast message sent to many recipients, the receive timestamps will generally differ from one recipient to another.

Figure 15.8 shows the two-step variant normally used by the protocol. The Sync event message and Follow_Up general message are sent from master ports using broadcast means to slave ports on the LAN segment. These messages are sent frequently, typically at intervals of 2 s. On the other hand, the Delay_Req event message and Delay_Resp general message are sent using unicast means. These messages are sent much less frequently, as they must be sent for every slave on the LAN segment.

Figure 15.8 PTP protocol operations.

The protocol requires four messages to determine four timestamps T1, T2, T3, and T4, as shown in Figure 15.8. On receipt of the Sync message, the timestamp field contains the receive timestamp (T2). The master saves the transmit timestamp (T1) of the Sync message and sends it in the timestamp field of the Follow_Up message. When the highest accuracy is not required, the quantity T1 – T2 represents the offset of the master relative to the slave.

In the one-step protocol variant, the Sync message timestamp field is overwritten by the transmit timestamp as the message is sent. When the message is received, this field, along with the receive timestamp, is passed to upper protocol layers for processing, and the Follow_Up message is not used. While this is efficient, it requires that the receive timestamp be saved somewhere in the receive buffer structure without affecting the message itself.

The Delay_Req and Delay_Resp messages are ordinarily interpreted to measure the delay from the master port to each individual slave port, but there is a simpler way to interpret the measurements consistent with the NTP on-wire protocol. At relatively infrequent intervals, the slave sends a Delay_Req event message to the master and saves the transmit timestamp (T3) for later use. The master immediately replies with a Delay_Resp general message, including the receive timestamp (T4) in the timestamp field. Finally, the slave calculates its offset and delay relative to the master as in Equations 15.1 and 15.2, noting that the sign of the offset is inverted relative to the usual NTP conventions.
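A minimal sketch of the resulting calculation, assuming the NTP-style formulas referred to as Equations 15.1 and 15.2 (not reproduced in this section), offset = ((T2 − T1) + (T3 − T4)) / 2 and delay = (T4 − T1) − (T3 − T2); as noted above, the sign of the offset comes out inverted with respect to the usual NTP convention, so the result below is the offset of the slave relative to the master.

    def ptp_offset_and_delay(t1, t2, t3, t4):
        """Compute offset and round-trip delay from the four PTP timestamps:
        T1 = master Sync transmit, T2 = slave Sync receive,
        T3 = slave Delay_Req transmit, T4 = master Delay_Req receive.
        The returned offset is slave minus master (sign inverted relative to NTP)."""
        offset = ((t2 - t1) + (t3 - t4)) / 2
        delay = (t4 - t1) - (t3 - t2)
        return offset, delay

    # The slave would apply -offset as the correction to its own clock.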

PTP and NTP Compared

In comparing PTP and NTP, we start with the observation that NTP is engineered to synchronize computer clocks in an extended network, while PTP is engineered to synchronize device clocks in a segmented LAN. Device clocks in telecommunications, test, and measurement equipment use relatively high-quality oscillators disciplined directly by PTP since in general there is no operating system with competing programs, buffers, and queues that might get in the way. Computer clocks use commodity-quality oscillators disciplined indirectly over a network in which application programs and packets can get in each other’s way.

Setting aside the accuracy issue for the moment, PTP and NTP have many aspects in common. In both protocols, synchronization flows from the primary servers (GMs) via secondary servers (BCs) to clients (OCs). An embedded algorithm constructs a shortest-path spanning tree, although with quite different metrics. Both NTP and PTP assume each member clock is disciplined by some means in time and frequency, and both have means to detect faults and estimate timekeeping quality. Both can measure clock offset and round-trip delay, but neither can detect nonreciprocal delays. Both can operate in broadcast or unicast mode, and both use the same offset and delay calculations, although expressed in different ways. While not discussed here, both have similar authentication provisions based on cryptographic message digests.

One difference between NTP and PTP is expressed in the NTP operating modes, which can be client/server (master-slave in PTP), broadcast/multicast (multicast in PTP), and symmetric modes that have no equivalent in PTP. This is reflected in the NTP service model, in which either of two peers can back up the other should one of them lose all synchronization sources. This can also be the case in PTP with multiple GMs, but the subnet has to reconfigure in case of failure.

The most striking difference between PTP and NTP, at least in some subnet configurations, is that PTP has the intrinsic capability to construct the spanning tree with no specific prior configuration, while in NTP the servers or peers are explicitly configured. However, this is not always the case. An NTP Ethernet including broadcast servers and clients will self-assemble much as in PTP, with prior configuration needed only to provide the subnet broadcast address, and this would be necessary in PTP as well. An NTP subnet using manycast mode, in which each NTP segment operates with a distinct group address, would also self-assemble as in PTP.

The capability to self-assemble applies to both NTP and PTP; however, the fact that PTP uses timestamps captured at the NIC means that the phase noise is much less than with NTP. On the other hand, NTP uses softstamps and drivestamps, which are vulnerable to buffering and queuing latencies in the operating system and network. A fundamental distinction between NTP and PTP is that NTP is normally utilized where relatively long update intervals are required to minimize network load. On the other hand, PTP is normally utilized on high-speed LANs with no such requirement and operates with update intervals on the order of 2 s. On a LAN with reduced phase noise and shorter update intervals, PTP can provide far better performance than NTP, even when using the same commodity oscillator.

An important feature of NTP is the suite of algorithms that groom and mitigate between and among a number of servers when more than one is available. PTP has no need for these algorithms as only the best GM is included in the spanning tree. In NTP, the best source is represented by the system peer at each stratum level. A significant difference between the NTP and PTP specifications is that in NTP the secondary server discipline algorithm must have a defined transient response to control overshoot and avoid instability in long timing chains. The PTP specification leaves this as an implementation choice. This would be most important in the BC discipline loop.

Finally, there is the issue of how many timestamps can be included in the packet header. With PTP, there is only one, which constrains the flexibility in the protocol design. With NTP, there are three operative timestamps that support all NTP modes, including the interleaved modes described in the next topic. This provides not only flexibility but also strong duplicate and replay protection without needing serial numbers.

There are a number of scenarios in which NTP and PTP can be exploited in a combined system. In principle, a software implementation of PTP with softstamps or drivestamps is possible and should perform as well as NTP, at least without the grooming and mitigation algorithms. Perhaps the most useful scenario would be a server that runs NTP with one or more PTP NICs operating as masters or slaves on the PTP subnet. The PTP NIC could launch NTP packets while running PTP. Some NICs, in particular the Network Instruments NI1588, provide a PPS signal synchronized to the 1588 clock via a connector on the NIC chassis. This signal could be used by the PPS driver included in the NTP software distribution. This is probably the easiest, quickest way to import PTP time to NTP and the operating system.

Importing and exporting PTP time to and from the computer itself is straightforward. Application programs could read the 1588 clock directly using an I/O command, but this is probably not a good idea, since the overhead of performing I/O instructions, which involves queuing, buffering, and scheduling latencies, is much higher than that of reading the system clock. It would be more efficient to read the 1588 clock infrequently, say once per minute, and treat it as an NTP reference clock. In this design, the computer clock itself is used as a flywheel between 1588 clock updates.
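A minimal sketch of that design, assuming a hypothetical read_1588_clock() accessor provided by the NIC driver; the samples it yields could be fed to an NTP-style reference clock driver while the computer clock flywheels in between.

    import time

    POLL_INTERVAL = 60.0   # read the 1588 clock about once per minute, as suggested above

    def read_1588_clock():
        """Hypothetical accessor for the NIC's 1588 clock, in seconds; a real
        implementation would use the driver's ioctl or equivalent interface."""
        raise NotImplementedError

    def refclock_samples():
        """Yield (offset, system_time) samples for an NTP-style reference clock driver."""
        while True:
            t_nic = read_1588_clock()
            t_sys = time.time()
            yield t_nic - t_sys, t_sys   # offset of the 1588 clock from the system clock
            time.sleep(POLL_INTERVAL)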

It does not seem practical to translate PTP performance variables to the NTP error budget, other than the various variance statistics carried in the PTP Announce message. If the computer system clock is to be used for the GM function, the 1588 clock adjustment function could be used in a feedback loop much the way the computer clock itself is disciplined by NTP. Some way is needed to pass on components of the NTP error budget that might be useful as PTP statistics; this might be explored in the future, as might the role of the management and signaling functions.
