Lists and Arrays Revisited - Data Structures and Algorithm Analysis

Java Reference

In-Depth Information

are non-zero. Different values for the relative sizes of data values, pointers, or

matrix indices can lead to a different break-even point for the two implementations.

The time required to process a sparse matrix should ideally depend on NNZ.

When searching for an element, the cost is the number of elements preceding the

desired element on its row or column list. The cost for operations such as adding

two matrices should be (n + m) in the worst case when the one matrix stores n

non-zero elements and the other stores m non-zero elements.

Another representation for sparse matrices is sometimes called the Yale rep-

resentation. Matlab uses a similar representation, with a primary difference being

that the Matlab representation uses column-major order. 1 The Matlab representa-

tion stores the sparse matrix using three lists. The first is simply all of the non-zero

element values, in column-major order. The second list stores the start position

within the first list for each column. The third list stores the row positions for each

of the corresponding non-zero values.

In the Yale representation, the matrix of

Figure 12.7 would appear as:

Values:

10 45 40 23 5 32 93 12 19 7

Column starts:

0 3 5 5 7 7 7 7

Row positions:

0 1 4 0 1 7 1 7 0 7

If the matrix has c columns, then the total space required will be proportional to

c + 2NNZ. This is good in terms of space. It allows fairly quick access to any

column, and allows for easy processing of the non-zero values along a column.

However, it does not do a good job of providing access to the values along a row,

and is terrible when values need to be added or removed from the representation.

Fortunately, when doing computations such as adding or multiplying two sparse

matrices, the processing of the input matrices and construction of the output matrix

can be done reasonably efficiently.

12.3

Memory Management

Most data structures are designed to store and access objects of uniform size. A

typical example would be an integer stored in a list or a queue. Some applications

require the ability to store variable-length records, such as a string of arbitrary

length. One solution is to store in the list or queue fixed-length pointers to the

variable-length strings. This is fine for data structures stored in main memory.

But if the collection of strings is meant to be stored on disk, then we might need

to worry about where exactly these strings are stored. And even when stored in

main memory, something has to figure out where there are available bytes to hold

the string. We could easily store variable-size records in a queue or stack, where

1 Scientific packages tend to prefer column-oriented representations for matrices since this the

dominant access need for the operations to be performed.

Data Structures and Algorithm Analysis

Search WWH ::

Custom Search

Home