Hardware Reference
In-Depth Information
The other, more insidious type of failure is the soft error, which is a nonpermanent failure that might
never recur or could occur only at infrequent intervals. Soft error rates are known as SERs.
In the late 1970s, Intel made a discovery about soft errors that shook the memory industry. It found
that alpha particles were causing an unacceptably high rate of soft errors, or single event upsets
(SEUs, as they are sometimes called), in the 16Kb (16-kilobit) DRAMs that were available at the time.
Because alpha particles are relatively low-energy, weakly penetrating particles that can be stopped by
something as thin and light as a sheet of paper, it became clear that for alpha particles to cause a
DRAM soft error, they would have to be coming from within the chip itself. Testing showed trace amounts
of thorium and uranium in the plastic and ceramic chip packaging materials used at the time. This
discovery forced all the memory manufacturers to reevaluate their manufacturing processes and to
produce materials free from contamination.
Today, memory manufacturers have all but eliminated the alpha-particle source of soft errors,
and more recent findings indicate that alpha particles now account for only a small fraction of
DRAM soft errors.
As it turns out, the biggest cause of soft errors today is cosmic rays. IBM researchers began
investigating the potential of terrestrial cosmic rays to cause soft errors, much as alpha particles do.
The difference is that cosmic rays are high-energy particles and can't be stopped by sheets of paper
or even by much heavier shielding. The leader in this line of investigation was Dr. J.F. Ziegler
of the IBM Watson Research Center in Yorktown Heights, New York. He has produced landmark
research into understanding cosmic rays and their influence on soft errors in memory. One interesting
set of experiments found that cosmic ray-induced soft errors were eliminated when the DRAMs were
moved to an underground vault shielded by more than 50 feet of rock.
Cosmic ray-induced errors are even more of a problem in SRAMs than in DRAMs because the amount
of charge required to flip a bit in an SRAM cell is less than that required to flip a DRAM cell
capacitor. Cosmic rays are also more of a problem for higher-density memory. As chip density
increases, it becomes easier for a stray particle to flip a bit. Some have predicted that the
soft error rate of a 64Mb DRAM is double that of a 16Mb chip, and that a 256Mb DRAM has a rate four
times higher. As memory sizes continue to increase, it's likely that soft error rates will also increase.
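To make the density prediction concrete, the following is a minimal sketch of the arithmetic, not a measured model. It reads the three densities as megabits per chip and treats the 256Mb figure as four times the 16Mb rate, which is one reading of the prediction. The baseline rate is purely hypothetical and is expressed in FIT (failures per billion device-hours), a common reliability unit that the text itself does not use. The function name and parameters are illustrative only.

```python
# Relative per-chip soft error rates, taken from the prediction quoted in
# the text (16Mb chip = baseline). These are predicted ratios, not data.
RELATIVE_SER = {"16Mb": 1.0, "64Mb": 2.0, "256Mb": 4.0}

HOURS_PER_YEAR = 24 * 365


def expected_soft_errors(density, chips, years, baseline_fit):
    """Expected soft errors for `chips` devices of one density over `years`.

    baseline_fit is a HYPOTHETICAL rate for a single 16Mb chip, in FIT
    (failures per 10**9 device-hours). It is an assumption, not a figure
    from the text.
    """
    fit_per_chip = baseline_fit * RELATIVE_SER[density]
    device_hours = chips * years * HOURS_PER_YEAR
    return fit_per_chip * device_hours / 1e9


if __name__ == "__main__":
    # Eight chips running for one year, with an assumed 1000 FIT baseline.
    for density in RELATIVE_SER:
        n = expected_soft_errors(density, chips=8, years=1, baseline_fit=1000.0)
        print(f"{density}: {n:.5f} expected soft errors per year")
```

Whatever baseline is assumed, the ratios between the three results stay at 1 : 2 : 4, which is all the prediction actually claims. The absolute numbers depend entirely on the assumed baseline.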
Unfortunately, the PC industry has largely failed to recognize this cause of memory errors.
The random and intermittent nature of a soft error is much more easily blamed on electrostatic
discharge, power surges, or unstable software, especially right after a new release of an operating
system (OS) or major application.
Although cosmic rays and other radiation events are perhaps the biggest cause of soft errors, soft
errors can also be caused by the following:
Power glitches or noise on the line —This can be caused by a defective power supply in the
system or by defective power at the outlet.
Incorrect type or speed rating —The memory must be the correct type for the chipset and
match the system access speed.
RF (radio frequency) interference —Caused by radio transmitters in close proximity to the
system, which can generate electrical signals in system wiring and circuits. Keep in mind that
the increased use of wireless networks, keyboards, and mouse devices can lead to a greater risk
of RF interference.
Static discharges —These discharges cause momentary power spikes, which alter data.
 