Time-Triggered Communication - Networked Embedded Systems

In a time-triggered communication protocol, the error containment mechanisms for timing

message failures can be enforced transparently to the application. Using the a priori knowledge

concerning the global points in time of all intended message send and receive instants, autonomous

guardians can block timing message failures. For this purpose, node-local and centralized guardians

have been developed and validated for different time-triggered communication protocols (e.g.,

[BKS,Gua]). For example, in TTP, a guardian transforms a message which it judges untimely

into a syntactically incorrect message by cutting off its tail [Kop].
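
The guardian behavior described above can be sketched as follows. This is a simplified illustration, not the TTP guardian itself: the `Slot` type, the microsecond time base, the `guard` function, and the assumption that the last two bytes of a frame hold its checksum are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A priori send window of one sender, in microseconds of global time."""
    sender: str
    start_us: int
    end_us: int


def guard(schedule: dict[str, Slot], sender: str, t_start_us: int,
          t_end_us: int, frame: bytes, tail_len: int = 2) -> bytes:
    """Forward a timely frame unchanged; cut the tail off an untimely one.

    Removing the last `tail_len` bytes (assumed here to hold the checksum)
    leaves a syntactically incorrect frame that receivers will discard.
    """
    slot = schedule.get(sender)
    timely = (slot is not None
              and slot.start_us <= t_start_us
              and t_end_us <= slot.end_us)
    if timely:
        return frame
    return frame[:-tail_len] if len(frame) > tail_len else b""


# Hypothetical two-slot schedule known to the guardian before run time.
schedule = {"A": Slot("A", 0, 100), "B": Slot("B", 100, 200)}
frame = b"\x01\x02\x03\x04\xAA\xBB"

on_time = guard(schedule, "A", 10, 90, frame)    # inside A's window
too_late = guard(schedule, "A", 95, 120, frame)  # overruns into B's window
```

Because the guardian relies only on the a priori schedule and the observed transmission times, it needs no knowledge of the message contents, which is why the mechanism stays transparent to the application.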

Error containment for value message failures is generally not part of a time-triggered communica-

tion protocol, but within the responsibility of the host computers. For example, value failure detection

and correction can be performed using N-modular redundancy (NMR). N replicas receive the same

requests and provide the same service. The output of all replicas is provided to a voting mechanism,

which selects one of the results (e.g., based on majority) or transforms the results to a single one

(average voter). The most frequently used N-modular configuration is triple-modular redundancy

(TMR). By employing three components and a voter, a single consistent value failure in one of the

constituting components can be tolerated.
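
The two voting strategies mentioned above can be sketched as follows. The function names `majority_vote` and `average_vote` are illustrative assumptions, and the sketch presumes that replica outputs are directly comparable values.

```python
from collections import Counter
from statistics import mean
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value delivered by a strict majority of replicas, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def average_vote(outputs: Sequence[float]) -> float:
    """Transform all replica outputs into a single value (average voter)."""
    return mean(outputs)


# TMR: three replicas, one of which delivers a wrong value.
voted = majority_vote([42, 42, 17])
no_majority = majority_vote([1, 2, 3])
averaged = average_vote([10.0, 11.0, 12.0])
```

With three replicas, the majority voter masks one faulty value because the two correct replicas still agree. If no strict majority exists, this sketch returns `None` so that the caller can treat the vote as failed.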

Although NMR is not natively provided by time-triggered communication protocols, these protocols

enable it by supporting replica determinism [Pol]. Fault-free

replicated components exhibit replica determinism, if they deliver identical outputs in an identical

order within a specified time interval. Replica determinism simplifies the implementation of fault-

tolerance by active redundancy, as failures of components can be detected by carrying out a bit-by-bit

comparison of the results of replicas. Replica nondeterminism is introduced either by the interface

to the real world or the system's internal behavior.
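
The following sketch illustrates why a bit-by-bit comparison depends on replica determinism. The scenario is an assumption chosen for illustration and does not come from the text: three replicas sum the same inputs, but one of them uses a different order of floating-point additions, standing in for nondeterministic internal behavior.

```python
import struct


def encode(value: float) -> bytes:
    """Serialize a float to its 8-byte IEEE 754 representation."""
    return struct.pack(">d", value)


def replicas_agree(outputs: list[bytes]) -> bool:
    """Bit-by-bit comparison: every replica output must be identical."""
    return all(out == outputs[0] for out in outputs[1:])


samples = [0.1, 0.2, 0.3]

# Deterministic replicas: same inputs, same order of operations.
r1 = encode((samples[0] + samples[1]) + samples[2])
r2 = encode((samples[0] + samples[1]) + samples[2])

# Hypothetical nondeterministic replica: same inputs, different summation order.
r3 = encode(samples[0] + (samples[1] + samples[2]))

deterministic_ok = replicas_agree([r1, r2])
mixed_ok = replicas_agree([r1, r2, r3])
```

Here `r3` differs from `r1` in its bit pattern even though no replica has failed, so an exact comparison would raise a false alarm. Without replica determinism, a voter would have to fall back on approximate agreement, which is harder to implement.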

14.4.4 Diagnostic Services

Diagnostic services are concerned with the identification of failed subsystems. Diagnostic services

can trigger the autonomous recovery of a system in case of a transient subsystem failure. In addition,

diagnostic services can support the replacement of defective subsystems if a failure is permanent.

An example of a diagnostic service that can be found in time-triggered communication protocols

is a solution to the membership problem. The membership problem is a fundamental problem in

distributed computing, because it allows solutions to other important problems in designing

fault-tolerant systems [GP]. The membership problem is defined as the problem of achieving agreement

on the identity of all correctly functioning processes of a process group. A process is correct, if its

behavior complies with the specification. Otherwise the process is denoted as faulty.

In the context of an integrated architecture, it makes sense to establish membership information for

FCRs, as FCRs can be expected to fail independently. Depending on the assumed types of faults, an

FCR is either an entire system component or a subsystem within a component (e.g., a task) dedicated

to a function.

A service that implements an algorithm for solving the membership problem and offers consistent

membership information is called a membership service. A membership service simplifies the pro-

vision of many application algorithms, as the architecture offers generic error detection capabilities

via this service. Applications can rely on the consistency of the membership information and react

to detected failures of FCRs as indicated by the membership service.
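
A minimal sketch of how an application might consume membership information is given below. It assumes a hypothetical time-triggered round in which each FCR owns one send slot, and it models only the local view of a single receiver. The names `run_round` and `failed_members` are illustrative, and the agreement step that makes the views of all receivers consistent is deliberately left out.

```python
def run_round(schedule: list[str], received: dict[str, bool]) -> dict[str, bool]:
    """Compute one receiver's membership view after a communication round.

    schedule: senders in slot order, known a priori.
    received: sender -> True if a correct frame arrived in that sender's slot.
    A sender with no entry is treated as having missed its slot.
    """
    return {sender: received.get(sender, False) for sender in schedule}


def failed_members(previous: dict[str, bool], current: dict[str, bool]) -> set[str]:
    """Return the senders that were members before and are not members now."""
    return {s for s, alive in previous.items()
            if alive and not current.get(s, False)}


schedule = ["A", "B", "C"]
before = run_round(schedule, {"A": True, "B": True, "C": True})
after = run_round(schedule, {"A": True, "C": True})  # B's slot stayed empty
failed = failed_members(before, after)
```

An application could react to the contents of `failed`, for instance by switching to a replica, without implementing its own failure detection.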

A membership service also plays an important role for controlling application level fault-tolerance

mechanisms that deal with failures of functions. If a function fails—as more FCRs have failed than

can be tolerated by the given amount of redundancy—all that an integrated architecture can do is

to inform other functions about this condition so they can react accordingly by application level

fault-tolerance mechanisms.

In a time-triggered communication system, the periodic message send times are membership

points of the sender [KGR]. Every receiver knows a priori when a message of a sender is supposed
