and the space in the file reclaimed. To avoid this we can simply link the group into the
file structure at our leisure:
>>> f['z'] = grpz
>>> grpz.name
u'/z'
The multiple names issue also affects the behavior of the .parent property. To address
this, in h5py, obj.parent is defined to be the “parent” object according to obj.name.
For example, if obj.name is /foo/bar, obj.parent.name will be /foo.
One way to express this is with the posixpath module built into Python:
>>> import posixpath
>>> parent = obj.file[posixpath.dirname(obj.name)]
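For instance, with the grpz group linked above (reusing the names from the preceding
examples), obj.parent should give the same result as the posixpath approach:
>>> grpz.parent.name
u'/'
>>> posixpath.dirname(grpz.name)
u'/'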
To remove links, we use the dictionary-style syntax del group[name]:
>>> del f['y']
Once all hard links to an object are gone (and the object isn't open somewhere in
Python), it's destroyed:
>>> del f['x']   # Last hard link; the group is deleted in the file
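A quick membership check (using the same file object f as above) confirms that the
name no longer resolves:
>>> 'x' in f
False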
Free Space and Repacking
When an object (for example, a large dataset) is deleted, the space it occupied on disk
is reused for new objects like groups and datasets. However, at the time of writing, HDF5
does not track such “free space” across file open/close cycles. So if you don't reuse
the space before closing the file, you may be left with a “hole” of unusable space in
the file that can't be reclaimed.
This issue is a high development priority for the HDF Group. In the meantime, if your
files seem unusually large you can “repack” them with the h5repack tool, which ships
with HDF5:
$ h5repack bigfile.hdf5 out.hdf5
Soft Links
Those of you who have used Linux or Mac OS X will be familiar with “soft” links. Unlike
“hard” links, which associate a link name with a particular object in the file, soft links
instead store the path to an object.
Here's an example. Let's create a file and populate it with a single group containing a
dataset:
>>> f = h5py.File('test.hdf5', 'w')
>>> grp = f.create_group('mygroup')
>>> dset = grp.create_dataset('dataset', (100,))
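To preview how this looks in h5py (a minimal sketch; the link name 'alias' is just an
illustrative choice), a soft link is created by assigning an h5py.SoftLink object that
holds the target path, and it dereferences to the dataset when accessed:
>>> f['alias'] = h5py.SoftLink('/mygroup/dataset')
>>> f['alias'].shape
(100,)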