Java Reference
In-Depth Information
are non-zero. Different values for the relative sizes of data values, pointers, or
matrix indices can lead to a different break-even point for the two implementations.
The time required to process a sparse matrix should ideally depend on NNZ.
When searching for an element, the cost is the number of elements preceding the
desired element on its row or column list. The cost for operations such as adding
two matrices should be (n + m) in the worst case when the one matrix stores n
non-zero elements and the other stores m non-zero elements.
Another representation for sparse matrices is sometimes called the Yale rep-
resentation. Matlab uses a similar representation, with a primary difference being
that the Matlab representation uses column-major order. 1 The Matlab representa-
tion stores the sparse matrix using three lists. The first is simply all of the non-zero
element values, in column-major order. The second list stores the start position
within the first list for each column. The third list stores the row positions for each
of the corresponding non-zero values.
In the Yale representation, the matrix of
Figure 12.7 would appear as:
Values:
10 45 40 23 5 32 93 12 19 7
Column starts:
0 3 5 5 7 7 7 7
Row positions:
0 1 4 0 1 7 1 7 0 7
If the matrix has c columns, then the total space required will be proportional to
c + 2NNZ. This is good in terms of space. It allows fairly quick access to any
column, and allows for easy processing of the non-zero values along a column.
However, it does not do a good job of providing access to the values along a row,
and is terrible when values need to be added or removed from the representation.
Fortunately, when doing computations such as adding or multiplying two sparse
matrices, the processing of the input matrices and construction of the output matrix
can be done reasonably efficiently.
12.3
Memory Management
Most data structures are designed to store and access objects of uniform size. A
typical example would be an integer stored in a list or a queue. Some applications
require the ability to store variable-length records, such as a string of arbitrary
length. One solution is to store in the list or queue fixed-length pointers to the
variable-length strings. This is fine for data structures stored in main memory.
But if the collection of strings is meant to be stored on disk, then we might need
to worry about where exactly these strings are stored. And even when stored in
main memory, something has to figure out where there are available bytes to hold
the string. We could easily store variable-size records in a queue or stack, where
1 Scientific packages tend to prefer column-oriented representations for matrices since this the
dominant access need for the operations to be performed.
Search WWH ::




Custom Search