End-to-End EDMs. These mechanisms include end-to-end checksums for message
data and multiple (in practice, double) execution of tasks.
The end-to-end checksums are used to detect the corruption of message data
exchanged between two nodes of an FTU. They are checked by the receiving
task and thereby extend the fail silence property of the MARS nodes.
Double execution of a task in time redundancy can detect errors caused by
transient faults, provided the two instances of the task produce different output data.
Combined with the concept of message checksums, task execution in time redundancy
forms the highest level in the hierarchy of the error detection mechanisms.
These mechanisms also trigger the execution of a trap instruction, which causes a
reset of the node.
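The two end-to-end mechanisms described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which checksum MARS uses, so CRC-32 from Python's `zlib` stands in for it, and the names `seal`, `unseal`, `run_in_time_redundancy` and `NodeReset` are hypothetical. The exception plays the role of the trap instruction that resets the node.

```python
import zlib


class NodeReset(Exception):
    """Stands in for the trap instruction that causes a reset of the node."""


def seal(payload: bytes) -> bytes:
    """Sender side: append a 4-byte checksum (CRC-32 is an assumption)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unseal(message: bytes) -> bytes:
    """Receiving task: verify the trailing checksum, reset on mismatch."""
    if len(message) < 4:
        raise NodeReset("message too short to carry a checksum")
    payload, received = message[:-4], message[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != received:
        raise NodeReset("end-to-end checksum mismatch")
    return payload


def run_in_time_redundancy(task, message: bytes) -> bytes:
    """Execute the task twice on the same input and compare the outputs."""
    payload = unseal(message)
    first = task(payload)
    second = task(payload)
    if first != second:
        # A transient fault affected one of the two executions.
        raise NodeReset("outputs of the two executions differ")
    return seal(first)
```

A caller would pass the sealed input message and a deterministic task function; any raised `NodeReset` corresponds to the node going silent rather than emitting a possibly wrong message.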
8.3.4.2 The Experimental Framework
The testbed that has supported the fault injection experiments at each site features
five MARS nodes (Fig. 8.16). The node under test (NUT, for short) is the node
subject to the injection of a fault during each experiment run.
Another node (golden node) serves as a reference, and a third node (comparator
node) is used to compare the messages sent by the two previous nodes. When a
discrepancy is observed by the comparator node (fail silence violation) or the NUT
detects an error, the NUT is declared to be failed and is then shut down by the
comparator node to clear all error conditions for the subsequent experiment run. After
some time, power is restored and the NUT is reloaded for the next run. The data
generation node simulates the data corresponding to the real-time application that
is being used to activate the NUT and the golden node during each fault injection
experiment.
Fig. 8.16 The testbed architecture featuring five MARS nodes
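The comparator node's decision rule for one experiment run, as described in the text, might be sketched like this. The names `Outcome`, `classify_run` and `nut_failed` are hypothetical, and two points are assumptions rather than facts from the text: messages are compared position by position, and a NUT that stops sending early (goes silent) is not counted as a discrepancy.

```python
from enum import Enum


class Outcome(Enum):
    NO_FAILURE = "no failure observed"
    ERROR_DETECTED = "NUT detected an error"
    FAIL_SILENCE_VIOLATION = "fail silence violation"


def classify_run(nut_msgs, golden_msgs, nut_detected_error):
    """Classify one run from the messages of the NUT and the golden node."""
    # zip stops at the shorter sequence, so only messages the NUT
    # actually sent are compared against the golden node's messages.
    for sent, expected in zip(nut_msgs, golden_msgs):
        if sent != expected:
            # The NUT emitted a wrong message instead of staying silent.
            return Outcome.FAIL_SILENCE_VIOLATION
    if nut_detected_error:
        return Outcome.ERROR_DETECTED
    return Outcome.NO_FAILURE


def nut_failed(outcome):
    """The NUT is declared failed, and shut down, in both failure cases."""
    return outcome is not Outcome.NO_FAILURE
```

In this sketch a discrepancy takes precedence over a detected error, which is a design choice of the example and not something the text specifies.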
 