Hardware Reference
In-Depth Information
Timing glitches —Data doesn't arrive at the proper place at the proper time, causing errors.
Often caused by improper settings in the BIOS Setup, by memory that is rated slower than the
system requires, or by overclocked processors and other system components.
Heat buildup —High-speed memory modules run hotter than older modules. RDRAM RIMM
modules were the first memory to include integrated heat spreaders, and many high-performance
DDR, DDR2, and DDR3 memory modules now include heat spreaders to help fight heat
buildup.
Most of these problems don't cause chips to permanently fail (although bad power or static can
damage chips permanently), but they can cause momentary problems with data.
How can you deal with these errors? The best way is to increase the
system's fault tolerance, which means implementing ways of detecting and possibly correcting errors in
PC systems. Three basic levels and techniques are used for fault tolerance in modern PCs:
Nonparity
Parity
ECC
Nonparity systems have no fault tolerance. The only reason they are used is that they have the
lowest inherent cost. No additional memory is necessary, as is the case with parity or ECC
techniques. Because a parity-type data byte has 9 bits versus 8 for nonparity, memory cost is
approximately 12.5% higher. The nonparity memory controller is also simplified because it does not
need the logic gates to calculate parity or ECC check bits. Portable systems that place a premium on
minimizing power might benefit from the reduction in memory power resulting from fewer DRAM
chips. Finally, the memory system data bus is narrower, which reduces the number of data buffers.
The statistical probability of memory failures in a modern office desktop computer is now estimated
at about one error every few months. Errors will be more or less frequent depending on how much
memory you have.
This error rate might be tolerable for low-end systems that are not used for mission-critical
applications. In this case, the extreme market sensitivity to price probably can't justify the extra cost
of parity or ECC memory, and such errors then must be tolerated.
Parity Checking
One standard IBM set for the industry is that the memory chips in a bank of nine each handle 1 bit of
data: 8 bits per character plus 1 extra bit called the parity bit. The parity bit enables memory-control
circuitry to keep tabs on the other 8 bits—a built-in cross-check for the integrity of each byte in the
system.
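As a rough sketch of the idea (hypothetical helper name; even parity is shown here for illustration, though real parity hardware may instead use odd parity so that an all-zeros failure is still flagged):

```python
def parity_bit(byte):
    # Even parity: the stored ninth bit makes the total count of 1 bits
    # across all 9 bits (8 data + 1 parity) come out even.
    return bin(byte & 0xFF).count("1") % 2

print(parity_bit(0b01101001))  # four 1 bits -> parity bit 0
print(parity_bit(0b01101000))  # three 1 bits -> parity bit 1
```

In hardware this is a tree of XOR gates rather than a bit count, but the result is the same single check bit stored alongside each byte.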
Originally, all PC systems used parity-checked memory to ensure accuracy. Starting in 1994, most
vendors began shipping systems without parity checking or any other means of detecting or correcting
errors on the fly. These systems used cheaper nonparity memory modules, which saved about
10%-15% on memory costs for a system.
Parity memory results in increased initial system cost, primarily because of the additional memory
bits involved. Parity can't correct system errors, but because parity can detect errors, it can make the
user aware of memory errors when they happen.
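The detect-but-not-correct behavior can be sketched as follows (hypothetical function names, even parity assumed): recomputing the parity bit on a read flags any single-bit flip, but a mismatch says nothing about which bit changed, and an even number of flips cancels out and goes unnoticed entirely.

```python
def parity_bit(byte):
    # Even parity over the 8 data bits.
    return bin(byte & 0xFF).count("1") % 2

def parity_ok(byte, stored_parity):
    # Recompute parity on read and compare with the stored ninth bit.
    return parity_bit(byte) == stored_parity

data = 0b01101001
p = parity_bit(data)                  # computed when the byte is written

print(parity_ok(data, p))             # True: clean read
print(parity_ok(data ^ 0b100, p))     # False: single-bit flip detected
print(parity_ok(data ^ 0b110, p))     # True: double-bit flip goes undetected
```

This is why a parity error typically halts the system with an error message: the hardware knows the byte is bad but has no way to repair it, unlike ECC.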
Since then, Intel, AMD, and other manufacturers have put support for ECC memory primarily in
server chipsets and processors. Chipsets and processors for standard desktop or laptop systems