It is the number of bits stored in a cell that is significant to this discussion: a single-level cell (SLC) stores one bit per cell, which leads to fast transfer speeds. As an added advantage, SLC cells last longer than their counterparts. Fast and reliable: there has to be a downside, and that is cost. The higher cost associated with SLC means that such cells are almost exclusively found in enterprise-class solutions. You can therefore expect high-performance flash memory to be based on single-level cells.
Multi-level cells (MLC) store two bits per cell. Most MLCs, therefore, are slower due to the way the data is accessed. For the same reason, individual MLC cells wear out more quickly than SLC cells. However, MLC-based SSDs allow for larger capacities. As a rule of thumb, you would buy SLC for performance and MLC for capacity. But let's not forget that both SLC and MLC are still a lot faster than magnetic storage.
Triple-level cells (TLC) are not really new, but they do not yet seem to make commercial sense in the enterprise segment; TLC SSDs exist for the consumer market. The advantage of storing three bits per cell is higher capacity but, similar to the step from SLC to MLC, you get even more wear and slower performance.
Another term often heard in the context of SSDs is wear leveling. You read in the previous paragraphs that individual cells can wear out over the lifetime of the device. Cell wear is caused by writes. The controller managing the device therefore tries to spread the write load over as many cells as possible, completely transparently. The fewer writes a cell has to endure, the longer it will potentially last.
Multiple cells are organized into pages, which in turn are grouped into blocks. Most enterprise-class SSDs use a page size of 4 KB and a block size of 512 KB. These blocks are addressed much like those of any other block device, such as a hard disk, making 1:1 replacements easy and straightforward. For the same reason you could set the sector size of the SSD in Linux and other operating systems to 4 KB. Read and write operations can access random pages. Erase (delete) operations, however, require modification of a whole block: in the worst case, erasing a single page (usually 4 KB) means erasing the entire block. The storage controller obviously preserves the unaffected pages, writing them back together with the modified data. Such an operation is undesirable simply because it adds latency; additionally, the extra writes add to the individual cells' wear. If possible, instead of modifying existing cells, the controller will try to write to unused cells, which is a lot faster. Most SSDs therefore reserve a sizeable chunk of space that is not accessible from the operating system. Delete operations can then be completed in the background or deferred until space pressure arises. The more data is stored on the SSD, the more difficult it is for the controller to find free space. While the performance of an SSD is therefore generally very good, you might sometimes see outliers in write performance: all of a sudden, some writes incur up to 50 milliseconds of additional latency. Such outliers are known as the write cliff, caused by the phenomenon just described. When getting an SSD on loan, it is important to check how full the device is, a function that is often available from the driver.
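From within the database, such latency outliers also show up in Oracle's wait event histograms. The following query is a minimal sketch against V$EVENT_HISTOGRAM; the wait events listed here are common I/O-related events and merely examples, and which ones matter depends on your workload:

-- Sketch: spot I/O latency outliers in the wait event histograms.
-- On an SSD-backed system, noticeable counts in the 32 ms and higher
-- buckets for write events may indicate the write cliff described above.
SELECT event,
       wait_time_milli,
       wait_count
  FROM v$event_histogram
 WHERE event IN ('db file sequential read',
                 'db file parallel write',
                 'log file parallel write')
 ORDER BY event, wait_time_milli;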
When measuring the performance of SSDs with Oracle, it is important to use direct I/O. Direct I/O allows Oracle to bypass the file system cache, making performance numbers a lot more realistic. Without direct I/O, a request to read from the storage layer might well be satisfied from the operating system's file system cache. A blazingly fast response time of a millisecond or less in an extended trace file cannot simply be attributed to a well-functioning I/O storage subsystem, for there is no extra information as to where the requested block was found. When you instruct Oracle to bypass the file system cache, the reported times in the Active Session History and other performance-related information are more likely to reflect the true nature of the I/O request.
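On file systems that support it, direct I/O is enabled via the filesystemio_options initialization parameter. As a minimal sketch (the parameter is static, so an instance restart is required; the value SETALL enables both direct and asynchronous I/O):

-- Enable direct (and asynchronous) I/O for database files.
-- filesystemio_options is a static parameter: restart the instance
-- after changing it.
ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;

-- Verify after the restart:
SELECT name, value
  FROM v$parameter
 WHERE name = 'filesystemio_options';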
When testing a flash memory solution, you might also want to consider using a very small Oracle SGA to ensure that I/Os generally are not satisfied from the buffer cache. This is easier said than done, since Oracle allocates certain memory areas based, for example, on the number of CPUs as reported in the initialization parameter cpu_count. If you want to set your buffer cache to 48 MB, which is among the lowest possible values, you probably have to lower cpu_count to 2 and use manual SGA management to size the various pools accordingly.
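A minimal sketch of such a setup follows; the pool sizes are illustrative assumptions that will need adjusting to your release and workload, and the example assumes memory_target (Automatic Memory Management) is not in use:

-- Sketch: force a tiny buffer cache for flash storage testing.
-- All sizes are illustrative; adjust for your release and workload.
ALTER SYSTEM SET cpu_count = 2 SCOPE = SPFILE;
ALTER SYSTEM SET sga_target = 0 SCOPE = SPFILE;      -- manual SGA management
ALTER SYSTEM SET db_cache_size = 48M SCOPE = SPFILE;
ALTER SYSTEM SET shared_pool_size = 256M SCOPE = SPFILE;
ALTER SYSTEM SET large_pool_size = 16M SCOPE = SPFILE;
ALTER SYSTEM SET java_pool_size = 16M SCOPE = SPFILE;
-- Restart the instance for the changes to take effect.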
Putting It Together
So far, you have read a lot about different ways to attach storage to your consolidated platform. Why all that detail? In the author's experience, the DBA (or database engineer) is the best person to consult when it comes to rolling out an Oracle database solution. Why? The answer is simple: the database administrator knows how storage and both client- and internal-facing networking matter to his or her database. The operating system, storage, networking, and every other component of the solution serve only one purpose: to allow the database to execute. You will find congenial storage administrators,