Organizing Data with References, Types, and Dimension Scales - Python and HDF5


[False, False, True],

[ True, True, False]], dtype=bool)

You can create a region reference from this array:

>>> random_ref = dset_random.regionref[index_arr]

>>> dset_random.regionref.selection(random_ref)

(4,)

There were a total of four elements that matched, so the selection result is “packed” into

a four-element 1D buffer.

There is a rule to the order in which such elements are retrieved. If we apply our selection

to the dataset:

>>> data = dset_random[random_ref]

>>> data

array([ 0.57038087, 0.7758832 , 0.75768745, 0.73156554], dtype=float32)

Looking closely, it appears that the elements retrieved are at [0,2], [1,2], [2,0], [2,1], in that order. You'll recognize this as “C” order: the last index varies fastest, then the next to last, and so on.
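The ordering rule can be checked with a small, self-contained sketch. The file name, dataset name, and data below are hypothetical, and the first row of the mask is an assumption chosen to match the coordinates listed above (the original index_arr is only partly shown). The file uses h5py's in-memory "core" driver so nothing is written to disk.

```python
import numpy as np
import h5py

# In-memory file (core driver, no backing store); all names are illustrative.
with h5py.File("regionref_order_demo.h5", "w", driver="core",
               backing_store=False) as f:
    # Use recognizable values so the retrieval order is easy to read off.
    values = np.arange(9, dtype="f4").reshape(3, 3)
    dset = f.create_dataset("demo", data=values)

    # Boolean mask selecting [0,2], [1,2], [2,0], [2,1] (assumed layout).
    mask = np.array([[False, False, True],
                     [False, False, True],
                     [True, True, False]])

    # Build a region reference from the mask, as in the text.
    ref = dset.regionref[mask]

    # Shape of the "packed" result the reference will produce.
    packed_shape = dset.regionref.selection(ref)

    # Apply the reference to the dataset to read the selected elements.
    packed = dset[ref]

print(packed_shape)
print(packed)
print(values[mask])  # NumPy's own boolean indexing, also in C order
```

Comparing packed against values[mask] is a quick way to confirm that the packed buffer follows the same C ordering NumPy uses for boolean indexing.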

Unfortunately, list-based selections will also be returned as 1D arrays, unlike NumPy (try it!). This is a limitation of the HDF5 library.

Finding Datasets with Region References

There's one more trick region references can do. If you have a region reference, say our shape-(80,) selection ref_out from earlier, you can use it as an object reference to retrieve the dataset:

>>> f[ref_out]

This can come in handy when you've stored a region reference as an attribute somewhere. It means you don't have to also store an object reference to figure out where to apply the selection.

And if you're just after the data and don't care about the dataset itself, all you have to

do is:

>>> selected_data = f[ref_out][ref_out]
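As a minimal sketch of this double lookup, the example below builds its own region reference rather than reusing ref_out, whose definition is not shown here. The file name, the dataset name "grid", and the slice are all hypothetical, and the file lives in memory via the "core" driver.

```python
import numpy as np
import h5py

# In-memory file (core driver, no backing store); all names are illustrative.
with h5py.File("regionref_lookup_demo.h5", "w", driver="core",
               backing_store=False) as f:
    grid = f.create_dataset("grid", data=np.arange(100).reshape(10, 10))

    # A rectangular region reference covering rows 2-3, columns 0-4.
    ref = grid.regionref[2:4, 0:5]

    # Dereference through the file: the region reference acts like an
    # object reference and yields the dataset it points into.
    target = f[ref]
    target_name = target.name

    # Dereference, then apply the same reference as a selection.
    selected = f[ref][ref]

print(target_name)
print(selected.shape)
```

Here the first indexing step only locates the dataset; the second applies the stored selection to it, so the reference alone is enough to recover both the dataset and the data.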

Named Types

There's one more “linking” concept in HDF5, and it's much more subtle than either

object or region references. We've already seen that when you create a dataset (or an
