What causes sector or page failures?
For spinning disks, permanent sector failures can be caused by a range of faults such
as pits in the magnetic coating where a contaminant flaked off the surface, scratches
in the coating where a contaminant was dragged across the surface by the head, or
smears of machine oil across some sectors of a disk surface.
Transient sector faults, where a sector's stored data is corrupted but where new data
can be successfully written to and read from the sector, can be caused by factors such
as write interference where writes to one track disturb bits stored on nearby tracks and
“high fly writes” where the disk head gets too far from the surface, producing magnetic
fields too weak to be accurately read.
For flash storage, permanent page failures can be caused by manufacturing defects
or by wear-out when a page experiences a large number of write/erase cycles.
Transient flash storage failures can be caused by write disturb errors where charging
one bit also causes a nearby bit to be charged, read disturb errors where repeatedly
reading one page changes values stored on a nearby page, over-programming errors
where too high a voltage is used to write a cell, which may cause incorrect reads or
writes, and data retention errors where charge may leak out of or into a flash cell over
time, changing its value; wear-out from repeated write/erase cycles can make devices
more susceptible to data retention errors.
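The distinction between transient and permanent faults can be illustrated with a
small simulation. The sketch below is only an illustration under stated assumptions:
SimulatedSector, Fault, and classify are hypothetical names, not part of any real
device interface, and the expected contents are assumed to be known to the caller
(a real device would detect corruption another way, for example with a checksum).
A fault is treated as transient if new data can be written and read back, and as
permanent otherwise.

```python
from enum import Enum


class Fault(Enum):
    NONE = "none"
    TRANSIENT = "transient"
    PERMANENT = "permanent"


class SimulatedSector:
    """Hypothetical in-memory stand-in for one disk sector or flash page."""

    def __init__(self, data: bytes, damaged: bool = False):
        self._data = data
        # damaged=True models a permanent fault (a scratch, a worn-out page).
        self._damaged = damaged

    def corrupt(self) -> None:
        """Invert the first byte, modeling a transient fault such as write interference."""
        self._data = bytes([self._data[0] ^ 0xFF]) + self._data[1:]

    def write(self, data: bytes) -> None:
        # In this model a permanently damaged sector cannot store new data.
        if not self._damaged:
            self._data = data

    def read(self) -> bytes:
        # In this model a permanently damaged sector always reads back as zeros.
        if self._damaged:
            return bytes(len(self._data))
        return self._data


def classify(sector: SimulatedSector, expected: bytes) -> Fault:
    """Compare the stored data to `expected`, then test whether a rewrite succeeds."""
    if sector.read() == expected:
        return Fault.NONE
    probe = bytes([0xA5]) * len(expected)  # arbitrary test pattern
    sector.write(probe)
    if sector.read() == probe:
        return Fault.TRANSIENT  # old data lost, but the sector still works
    return Fault.PERMANENT      # the sector can no longer store data


if __name__ == "__main__":
    disturbed = SimulatedSector(b"hello")
    disturbed.corrupt()
    print(classify(disturbed, b"hello"))
    print(classify(SimulatedSector(b"hello", damaged=True), b"hello"))
```

Note that the probe write destroys whatever was left in the sector, so a test of this
kind only makes sense once the original contents are already considered lost.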
14.2.1 Storage device failures and mitigation
Storage hardware pushes the limits of physics, materials science, and manufacturing
processes to maximize storage capacity and performance. These aggressive designs
leave little margin for error, so manufacturing defects, contamination, or wear can
cause stored bits to be lost.
Individual spinning disks and flash storage devices exhibit two types of failure.
First, isolated disk sectors or flash pages can lose existing data or degrade
to the point where they cannot store new data. Second, an entire device can
fail, preventing access to all of its sectors or pages. We discuss each of these in
turn to understand the problems higher-level techniques need to deal with.
Sector and page failures
Disk sector failures occur when data on one or more individual sectors of a
disk are lost, but the rest of the disk continues to operate correctly. Flash page
failures are the equivalent for flash pages.
Definition: sector failure
Definition: page failure
Storage devices use two techniques to mitigate sector or page failures:
error-correcting codes and remapping.
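To make the two techniques named above concrete, here is a minimal sketch of each.
It rests on several assumptions: the Hamming(7,4) code is chosen only because it is
the smallest textbook error-correcting code (real devices use much stronger codes
over whole sectors or pages), and RemappingDevice with its methods is a hypothetical
model of a spare-sector table, not an actual device interface.

```python
def hamming74_encode(nibble: int) -> list[int]:
    """Encode 4 data bits as a 7-bit Hamming codeword (list index 0 is position 1)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8  # index 0 unused so indices match 1-based bit positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits: list[int]) -> tuple[int, bool]:
    """Return (nibble, corrected); repairs any single flipped bit."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    corrected = syndrome != 0
    if corrected:
        code[syndrome] ^= 1  # the syndrome is the position of the flipped bit
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), corrected


class RemappingDevice:
    """Hypothetical device that redirects failed sectors to spare sectors."""

    def __init__(self, num_sectors: int, num_spares: int):
        # Spares are modeled as physical sectors placed after the normal ones.
        self.spares = list(range(num_sectors, num_sectors + num_spares))
        self.remap: dict[int, int] = {}

    def physical(self, logical: int) -> int:
        """Translate a logical sector number to the physical sector now backing it."""
        return self.remap.get(logical, logical)

    def retire(self, logical: int) -> int:
        """Stop using a failed sector and back its logical number with a spare."""
        if not self.spares:
            raise RuntimeError("no spare sectors left")
        self.remap[logical] = self.spares.pop(0)
        return self.remap[logical]


if __name__ == "__main__":
    word = hamming74_encode(0b1011)
    word[4] ^= 1  # simulate one bit corrupted on the medium
    print(hamming74_decode(word))

    dev = RemappingDevice(num_sectors=100, num_spares=2)
    dev.retire(7)
    print(dev.physical(7))
```

In this sketch the code handles a single corrupted bit without losing data, while
remapping handles a sector that can no longer be trusted at all. Note that
Hamming(7,4) cannot tell a double-bit error from a single-bit one, which is one
reason it stands in here only as an illustration.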