Hardware Reference
In-Depth Information
Timing glitches —Data doesn't arrive at the proper place at the proper time, causing errors.
Often caused by improper settings in the BIOS Setup, by memory that is rated slower than the
system requires, or by overclocked processors and other system components.
Heat buildup —High-speed memory modules run hotter than older modules. RDRAM RIMM
modules were the first memory to include integrated heat spreaders, and many high-performance
DDR, DDR2, and DDR3 memory modules now include heat spreaders to help fight heat
buildup.
Most of these problems don't cause chips to permanently fail (although bad power or static can
damage chips permanently), but they can cause momentary problems with data.
How can you deal with these errors? The best way is to increase the
system's fault tolerance, which means implementing ways of detecting and possibly correcting errors in
PC systems. Three basic levels and techniques are used for fault tolerance in modern PCs:
Nonparity
Parity
ECC
Nonparity systems have no fault tolerance. The only reason they are used is that they have the
lowest inherent cost. No additional memory is necessary, as is the case with parity or ECC
techniques. Because a parity-type data byte has 9 bits versus 8 for nonparity, memory cost is
approximately 12.5% higher. The nonparity memory controller is also simplified because it does not
need the logic gates to calculate parity or ECC check bits. Portable systems that place a premium on
minimizing power might benefit from the reduction in memory power resulting from fewer DRAM
chips. Finally, the memory system data bus is narrower, which reduces the number of data buffers.
The statistical probability of memory failures in a modern office desktop computer is now estimated
at about one error every few months. Errors will be more or less frequent depending on how much
memory you have.
This error rate might be tolerable for low-end systems that are not used for mission-critical
applications. In this case, the extreme market sensitivity to price probably can't justify the extra cost
of parity or ECC memory, and such errors then must be tolerated.
Parity Checking
One standard IBM set for the industry is that the memory chips in a bank of nine each handle 1 bit of
data: 8 bits per character plus 1 extra bit called the parity bit. The parity bit enables memory-control
circuitry to keep tabs on the other 8 bits—a built-in cross-check for the integrity of each byte in the
system.
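As a rough sketch of the idea (hypothetical helper name; even parity is shown here for illustration, though real parity hardware may instead use odd parity so that an all-zeros failure is still flagged):

```python
def parity_bit(byte):
    # Even parity: the stored ninth bit makes the total count of 1 bits
    # across all 9 bits (8 data + 1 parity) come out even.
    return bin(byte & 0xFF).count("1") % 2

print(parity_bit(0b01101001))  # four 1 bits -> parity bit 0
print(parity_bit(0b01101000))  # three 1 bits -> parity bit 1
```

In hardware this is a tree of XOR gates rather than a bit count, but the result is the same single check bit stored alongside each byte.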
Originally, all PC systems used parity-checked memory to ensure accuracy. Starting in 1994, most
vendors began shipping systems without parity checking or any other means of detecting or correcting
errors on the fly. These systems used cheaper nonparity memory modules, which saved about
10%-15% on memory costs for a system.
Parity memory results in increased initial system cost, primarily because of the additional memory
bits involved. Parity can't correct system errors, but because parity can detect errors, it can make the
user aware of memory errors when they happen.
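The detect-but-not-correct behavior can be sketched as follows (hypothetical function names, even parity assumed): recomputing the parity bit on a read flags any single-bit flip, but a mismatch says nothing about which bit changed, and an even number of flips cancels out and goes unnoticed entirely.

```python
def parity_bit(byte):
    # Even parity over the 8 data bits.
    return bin(byte & 0xFF).count("1") % 2

def parity_ok(byte, stored_parity):
    # Recompute parity on read and compare with the stored ninth bit.
    return parity_bit(byte) == stored_parity

data = 0b01101001
p = parity_bit(data)                  # computed when the byte is written

print(parity_ok(data, p))             # True: clean read
print(parity_ok(data ^ 0b100, p))     # False: single-bit flip detected
print(parity_ok(data ^ 0b110, p))     # True: double-bit flip goes undetected
```

This is why a parity error typically halts the system with an error message: the hardware knows the byte is bad but has no way to repair it, unlike ECC.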
Since then, Intel, AMD, and other manufacturers have put support for ECC memory primarily in
server chipsets and processors. Chipsets and processors for standard desktop or laptop systems