Organizing Data with References, Types, and Dimension Scales - Python and HDF5


But if we rename the dataset, this quickly breaks:

>>> f.move('mydata', 'mydata2')
>>> out = f[grp1.attrs['dataset']]
KeyError: "unable to open object"

Using object references instead, we have:

>>> grp1.attrs['dataset'] = dset.ref
>>> grp1.attrs['dataset']
<HDF5 object reference>
>>> out = f[grp1.attrs['dataset']]
>>> out == dset
True

Moving the dataset yet again, the reference still resolves:

>>> f.move('mydata2', 'mydata3')
>>> out = f[grp1.attrs['dataset']]
>>> out == dset
True
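The contrast between a stored path and a stored object reference can be reproduced end to end. The following is a minimal sketch, assuming a throwaway in-memory file; the file name refs_demo.h5, the sample data, and the attribute name path are illustrative choices, not part of the original example.

```python
import h5py

# In-memory file: the "core" driver with backing_store=False writes nothing to disk.
f = h5py.File("refs_demo.h5", "w", driver="core", backing_store=False)

grp1 = f.create_group("grp1")
dset = f.create_dataset("mydata", data=[1, 2, 3])

# Store both a plain path (hypothetical attribute name) and an object reference.
grp1.attrs["path"] = "mydata"
grp1.attrs["dataset"] = dset.ref

# Rename the dataset twice, as in the text.
f.move("mydata", "mydata2")
f.move("mydata2", "mydata3")

# The stored path is now stale, but the reference still resolves.
name_still_works = grp1.attrs["path"] in f
out = f[grp1.attrs["dataset"]]
ref_still_works = bool(out == dset)
values = out[...].tolist()

print(name_still_works, ref_still_works, values)
f.close()
```

The reference tracks the object itself rather than any one of its names, which is why it survives both renames while the stored path does not.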

When you open an object by dereferencing, HDF5 occasionally can't figure out the object's name. In that case, obj.name will return None. It's less of a problem than it used to be (HDF5 1.8 has gotten very good at figuring out names), but don't be alarmed if you happen to get None.

References as Data

References are full-fledged types in HDF5; you can freely use them in both attributes and datasets. Obviously there's no native type in NumPy for references, so we once again call on special_dtype for help, this time with the ref keyword:

>>> dt = h5py.special_dtype(ref=h5py.Reference)
>>> dt
dtype(('|O4', [(({'type': <type 'h5py.h5r.Reference'>}, 'ref'), '|O4')]))

That's a lot of metadata. But don't be dismayed; just like variable-length strings, this is otherwise a regular object dtype:

>>> dt.kind
'O'
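To tell a reference dtype apart from an ordinary object dtype, h5py provides check_dtype, the counterpart of special_dtype. The sketch below assumes an h5py version that still ships both functions; the exact repr of dt varies between h5py and NumPy versions, so only the kind and the attached reference class are inspected.

```python
import h5py

dt = h5py.special_dtype(ref=h5py.Reference)

# NumPy sees an ordinary object dtype...
is_object_dtype = (dt.kind == "O")

# ...while h5py can recover the reference class attached as metadata.
ref_class = h5py.check_dtype(ref=dt)

print(is_object_dtype, ref_class is h5py.Reference)
```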

We can easily create datasets of Reference type:

>>> ref_dset = f.create_dataset("references", (10,), dtype=dt)
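Elements of such a dataset are written and read like any others, and each element read back can be dereferenced through the file. A minimal sketch, assuming an in-memory file and two illustrative target objects (the names ref_dataset_demo.h5, mydata, and grp1 are placeholders):

```python
import h5py

# In-memory file so nothing is written to disk.
f = h5py.File("ref_dataset_demo.h5", "w", driver="core", backing_store=False)

dset = f.create_dataset("mydata", data=[1.0, 2.0])
grp = f.create_group("grp1")

dt = h5py.special_dtype(ref=h5py.Reference)
ref_dset = f.create_dataset("references", (10,), dtype=dt)

# A reference can point at any object in the file: datasets and groups alike.
ref_dset[0] = dset.ref
ref_dset[1] = grp.ref

# Read the references back and dereference them via the file object.
first_ok = bool(f[ref_dset[0]] == dset)
second_ok = bool(f[ref_dset[1]] == grp)
first_values = f[ref_dset[0]][...].tolist()

print(first_ok, second_ok, first_values)
f.close()
```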

What's in such a dataset? If we retrieve an uninitialized element, we get a zeroed or “null” reference:
