Groups, Links, and Iteration: The “H” in HDF5 - Python and HDF5

You're not limited to using paths for source and destination. If you already have an open Dataset object, for example, you can copy it to a Group or File object:

>>> dset = f['/mygroup/apples']
>>> f.copy(dset, f)
>>> f.visit(printname)
apples
oranges
mygroup
mygroup/dataset
mygroup/subgroup
mygroup2
mygroup2/dataset
mygroup2/subgroup

Since the destination is a group, the dataset is created with its “base name” of apples, analogous to how files are copied with the UNIX cp command.

There's no requirement that the source and destination be the same file. This is one of the advantages of using File or Group objects instead of paths; the corresponding objects will be copied regardless of which file they reside in. If you're trying to write generic code, it's good to keep this in mind.
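As a rough illustration of copying between files, the sketch below passes a Dataset object from one file and a Group object from another to copy. The file names, the imported group, and the use of the in-memory core driver (so that nothing is written to disk) are assumptions made for this example only.

```python
import h5py

# Hypothetical file names; driver="core" with backing_store=False keeps
# both files in memory, so the sketch leaves nothing on disk.
src = h5py.File("copy_src_sketch.hdf5", "w", driver="core", backing_store=False)
dst = h5py.File("copy_dst_sketch.hdf5", "w", driver="core", backing_store=False)

# Intermediate groups ("mygroup") are created automatically.
src.create_dataset("mygroup/apples", data=[1.0, 2.0, 3.0])
target = dst.create_group("imported")

# Source is a Dataset object; destination is a Group in a *different* file.
# Because the destination is a group, the copy keeps the base name "apples".
src.copy(src["mygroup/apples"], target)

copied_names = list(target)                      # member names in the destination group
copied_values = target["apples"][...].tolist()   # data read back from the copy

src.close()
dst.close()
```

Had the source been given as a path string instead, h5py would have looked it up relative to the group on which copy was called, which is why objects are the more convenient choice when two files are involved.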

Object Comparison and Hashing

Let's take a break from links and iteration to discuss a more subtle aspect of how HDF5 behaves. In lots of the preceding examples, we used Python's equality operator to see if two groups are “the same thing”:

>>> f = h5py.File('objectdemo.hdf5', 'w')
>>> grpx = f.create_group('x')
>>> grpy = f.create_group('y')
>>> grpx == f['x']
True
>>> grpx == grpy
False

If we investigate further, we discover that this kind of equality testing is independent of whether the Python objects are one and the same:

>>> id(grpx)  # Uniquely identifies the Python object "grpx"
73399280
>>> id(f['x'])
73966416

In h5py, equality testing uses the low-level HDF5 facilities to determine which references (identifiers, in the HDF5 lingo) point to the same groups or datasets on disk. This information is also used to compute the hash of an object, which means you can safely use Group, File, and Dataset objects as dictionary keys or as the members of sets:
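A minimal sketch of both points follows, assuming h5py is installed. The file name, the in-memory core driver, and the variable names (again, labels, unique_count, and so on) are illustrative choices for this example, not part of the original session.

```python
import h5py

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File("objectdemo_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    grpx = f.create_group("x")
    grpy = f.create_group("y")
    again = f["x"]  # a second Python object referring to the same HDF5 group

    # Identity is about the Python wrapper; equality is about the HDF5 object.
    same_python_object = grpx is again
    equal_in_file = grpx == again

    # Equal objects hash alike, so either wrapper finds the same dictionary entry.
    hashes_match = hash(grpx) == hash(again)
    labels = {grpx: "first"}
    label = labels[again]

    # A set collapses the two wrappers for "x" into one member.
    unique_count = len({grpx, again, grpy})

print(same_python_object, equal_in_file, hashes_match, label, unique_count)
```

The results are computed inside the with block because the group objects stop being usable once the file is closed.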
