Related Technology (Computer Network Time Synchronization)

NTP is not the only network timekeeping technology. Other mechanisms have been specified in the Internet protocol suite to record and transmit the time at which an event takes place, including the Daytime protocol [1], Time protocol [2], ICMP (Internet Control Message Protocol) Timestamp message [3], and IP (Internet Protocol) Timestamp option [4]. Other synchronization algorithms are discussed in Cole and Foxcroft [5], the Digital Equipment Corporation [6], Gusella and Zatti [7], Halpern et al. [8], Lundelius and Lynch [9], Marzullo and Owicki [10], Rickert [11], Schneider [12], and Tripathi and Chang [13], while protocols based on them are described in the Digital Time Service Functional Specification Version T.1.0.5 [6], Tripathi and Chang [13], and Gusella and Zatti [14]. Clock synchronization algorithms, not necessarily for time, are discussed in Liao, Martonosi, and Clark [15] and Lu and Zhang [16].

The Daytime and Time protocols are the simplest ways to read the clock of a remote Internet host. In either protocol a client sends an empty message to the server, which then returns the current time, either as a binary count of seconds since 0 h 1 January 1900 (Time) or as a formatted date string (Daytime). Either protocol can run above the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP); however, TCP requires a much larger resource commitment than UDP and provides very little reliability enhancement.
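As a minimal sketch of the Time protocol exchange described above, the following Python fragment decodes a reply and shows how a UDP query might look. It assumes the reply is a 32-bit unsigned integer in network byte order and that the service listens on port 37, as specified in the Time protocol document [2]; the function names and the host argument are illustrative and not part of any standard library.

```python
import socket
import struct
from datetime import datetime, timedelta, timezone

# Epoch of the Time protocol: 0 h 1 January 1900 UTC.
TIME_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
TIME_PORT = 37  # assumed well-known port for the Time protocol


def decode_time_reply(payload: bytes) -> datetime:
    """Convert a 4-byte Time protocol reply to an aware UTC datetime."""
    if len(payload) != 4:
        raise ValueError("Time protocol reply must be exactly 4 bytes")
    # "!I" = network byte order, unsigned 32-bit integer.
    (seconds_since_1900,) = struct.unpack("!I", payload)
    return TIME_EPOCH + timedelta(seconds=seconds_since_1900)


def query_time_server(host: str, timeout: float = 2.0) -> datetime:
    """Send an empty UDP datagram to a (hypothetical) Time server and decode the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(b"", (host, TIME_PORT))
        payload, _ = sock.recvfrom(4)
    return decode_time_reply(payload)
```

Note that a single request and reply gives no way to separate the server clock offset from the network delay, which is one reason the result is only as good as the path is fast.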

The Digital Time Synchronization Service (DTSS) [6] has many of the same service objectives as NTP. The DTSS design features configuration management and correctness principles when operated in a managed network environment, while the NTP design features accuracy and stability when operated in the unmanaged Internet environment. In DTSS, a synchronization subnet consists of time providers, couriers, servers, and clerks. A DTSS time provider is synchronized to UTC via a radio or satellite receiver or telephone modem. A courier imports time from one or more distant servers for local redistribution, and a local server provides time for possibly many local clerks. In NTP, the time provider is called a reference clock, while generic NTP servers operate in the roles of DTSS couriers, servers, and clerks depending on the subnet configuration. Unlike NTP, DTSS does not need or use mode or stratum information and does not include provisions to filter, select, cluster, and combine time values or compensate for inherent frequency errors.

The Unix 4.3bsd time daemon timed [7] uses a single master time daemon to measure offsets of a number of slave hosts and send periodic corrections to them. In this model the master is determined using an election algorithm [14] designed to avoid situations in which either no master is elected or more than one master is elected. The election process requires a broadcast capability, which is not a ubiquitous feature of the Internet. While this model has been extended to support hierarchical configurations in which a slave on one network serves as a master on the other [13], the model requires handcrafted configuration tables to establish the hierarchy and avoid loops. In addition to the burdensome, but presumably infrequent, overhead of the election process, the offset measurement/correction process requires twice as many messages as NTP per update.

A scheme with features similar to NTP is described in Kopetz and Ochsenreiter [17]. It is intended for multiserver local-area networks (LANs) in which each of possibly many time servers determines its local time offset relative to each of the other servers in the set. It uses periodic timestamped messages, then determines the local clock correction using the fault-tolerant average (FTA) algorithm of Lundelius and Lynch [9]. The FTA algorithm, which is useful where up to k servers may be faulty, sorts the offsets, discards the k highest and k lowest, and averages the rest. This scheme is most suitable for LAN environments that support broadcast but would result in unacceptable overhead in the general Internet environment. In addition, for reasons given further in this topic, the statistical properties of the FTA algorithm are not likely to be optimal in an Internet environment with highly dispersive delays.
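The FTA step described above (sort, trim k values from each end, average the remainder) can be sketched in a few lines of Python. This is an illustration of the trimming rule only, not the full scheme of Kopetz and Ochsenreiter; the function name and the requirement of more than 2k samples are assumptions made for the example.

```python
def fault_tolerant_average(offsets, k):
    """Average the offsets after discarding the k highest and k lowest values.

    offsets: measured clock offsets (e.g., in seconds) relative to the other servers.
    k: maximum number of servers assumed to be faulty.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if len(offsets) <= 2 * k:
        # Assumption for this sketch: at least one sample must survive trimming.
        raise ValueError("need more than 2k offsets")
    ordered = sorted(offsets)
    kept = ordered[k:len(ordered) - k]  # drop k lowest and k highest
    return sum(kept) / len(kept)


# Example: two wildly wrong readings are discarded with k = 1.
correction = fault_tolerant_average([0.010, 0.012, 0.011, 5.0, -3.0], k=1)
```

Because every surviving sample receives equal weight, a sample that arrived over a long, noisy path counts as much as one from a short, quiet path, which hints at why the average may behave poorly when delays are highly dispersive.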

A good deal of research has gone into the issue of maintaining accurate time in a community in which some clocks cannot be trusted. As mentioned in a previous topic, a truechimer is a clock that maintains timekeeping accuracy to a previously published (and trusted) standard, while a falseticker is a clock that does not. Falsetickers can display erroneous or inconsistent times at different times and to different watchers. Determining whether a particular clock is a truechimer or falseticker is an interesting abstract problem.

The fundamental abstraction on which correctness principles are based is the happens-before relation introduced by Lamport [18]. Lamport and Melliar-Smith [19] show that 3m + 1 clocks are required to determine a reliable time value if no more than m of them are falsetickers, but only 2m + 1 clocks are required if digital signatures are available. Byzantine agreement methods are introduced in Pease, Shostak, and Lamport [20] and Srikanth and Toueg [21]. Other methods are based on convergence functions.

A convergence function operates on the offsets between multiple clocks to improve accuracy by reducing or eliminating errors caused by falsetickers. There are two classes of convergence functions: those involving interactive-convergence algorithms and those involving interactive-consistency algorithms.

Interactive-consistency algorithms are designed to detect faulty clock processes that might indicate grossly inconsistent offsets in successive readings or to different readers. These algorithms use an agreement protocol involving successive rounds of readings, possibly relayed and possibly augmented by digital signatures. Examples include the fireworks algorithm of Halpern et al. [8] and the optimum algorithm of Srikanth and Toueg [21]. However, these algorithms require large numbers of messages, especially when large numbers of clocks are involved, and are designed to detect faults that have rarely been found in the Internet experience. For these reasons they are not considered further in this topic.

The particular choice of offset and delay computations used in NTP is a variant of the returnable-time algorithm used in some digital telephone networks [23]. The filter and select algorithms are designed so that the clock synchronization subnet self-organizes as a hierarchical master-slave configuration, as in Mitra [24]. The select algorithm is based on the intersection algorithm of Marzullo and Owicki [10], together with a refinement algorithm similar to the self-stabilizing algorithm of Lu and Zhang [25]. What makes the NTP model unique among these schemes is the adaptive configuration, polling, filtering, selection, and discipline mechanisms that tailor the dynamics of the system to fit the ubiquitous Internet environment.
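For readers unfamiliar with the offset and delay computations mentioned above, the following Python sketch shows the usual four-timestamp form found in the NTP specification. The timestamp names t1 through t4 and the function name are labels chosen for this example; the sketch omits the filtering, selection, and discipline stages entirely.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Compute clock offset and round-trip delay from one request/reply exchange.

    t1: client clock when the request was sent
    t2: server clock when the request was received
    t3: server clock when the reply was sent
    t4: client clock when the reply was received
    All values are in seconds. The offset is the server time minus the client time.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)  # round trip minus server processing time
    return offset, delay


# Example: server is 10 s ahead, each direction takes 50 ms,
# and the server holds the request for 10 ms before replying.
offset, delay = offset_and_delay(t1=0.00, t2=10.05, t3=10.06, t4=0.11)
```

The offset estimate is exact only when the outbound and return paths have equal delay; any asymmetry appears as an error of up to half the round-trip delay, which is why the delay is computed and carried alongside the offset.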

Parting Shots

While incorrect time values due to improperly operating NTP software or protocol design are highly unlikely, hazards remain due to incorrect software external to NTP. These hazards include the Unix kernel and library routines that convert Unix time to and from conventional civil time in seconds, minutes, hours, days, years, and especially centuries. Although NTP uses these routines to format monitoring data displays, they are not used to discipline the system clock. They may in fact cause problems with certain application programs, but this is not an issue that concerns NTP correctness.

It is possible that some external source to which NTP synchronizes may produce a discontinuity that could then induce an NTP discontinuity. The NTP primary servers, which are the ultimate time references for the entire NTP population, obtain time from various sources, including radio and satellite receivers and telephone modems. Not all sources provide year information, and of those that do, not all of them provide the year in four-digit form. In point of fact, the reference implementation does not use the year information, even if available.

It is essential that any synchronization protocol such as NTP include provisions for multiple-server redundancy and multiple-route diversity. Past experience has demonstrated the wisdom of this approach, which protects clients against hardware and software faults as well as incorrectly operating reference clocks and sometimes even buggy software. For the most reliable service, the NTP configuration should include multiple reference clocks for primary servers, such as a backup radio or satellite receiver or telephone modem. Primary servers should run NTP with other primary servers to provide additional redundancy and mutual backup should the reference clocks themselves fail or operate incorrectly.
