>>> f = h5py.File('refs_demo.hdf5', 'w')
>>> grp1 = f.create_group('group1')
>>> grp2 = f.create_group('group2')
>>> dset = f.create_dataset('mydata', shape=(100,))
Looking at the group grp1, we notice an interesting property called ref:
>>> grp1.ref
<HDF5 object reference>
The object returned from accessing .ref is an HDF5 object reference. These are basically
pointers to objects in the file. You can “dereference” them by using the same syntax as
we used for string names:
>>> out = f[grp1.ref]
>>> out == grp1
True
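The round trip above can be sketched as a standalone script. This is a minimal illustration, not part of the original session: the filename refs_sketch.hdf5 and the variable names are assumptions, and the file is opened with the in-memory core driver so that nothing is written to disk.

```python
import h5py

# Assumption: an in-memory file (core driver, no backing store) stands in
# for the on-disk refs_demo.hdf5 used in the text.
with h5py.File('refs_sketch.hdf5', 'w', driver='core', backing_store=False) as f:
    grp1 = f.create_group('group1')

    ref = grp1.ref                # object reference pointing at group1
    out = f[ref]                  # dereference with ordinary indexing syntax

    same_object = (out == grp1)   # both handles refer to the same HDF5 object
    resolved_name = out.name      # path of the object the reference led to
```

Here same_object should be true, since the dereferenced handle and grp1 identify the same object in the file.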
By the way, the Python type for these objects is available at h5py.Reference, in case you
want to use isinstance:
>>> isinstance(grp1.ref, h5py.Reference)
True
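One place such a type check might be useful is a function that accepts either a path string or a reference. The helper below, resolve, is hypothetical and not part of h5py; it is only a sketch of how isinstance could be used to tell the two kinds of key apart before indexing.

```python
import h5py

def resolve(group, target):
    """Hypothetical helper: open `target`, given as a path string or a reference.

    Returns a (kind, object) pair so the caller can see which branch was taken.
    """
    if isinstance(target, h5py.Reference):
        kind = 'reference'
    elif isinstance(target, str):
        kind = 'path'
    else:
        raise TypeError('expected a path string or an h5py.Reference')
    # Both kinds of key use the same indexing syntax on a group.
    return kind, group[target]

# Assumption: an in-memory file is used so the sketch leaves nothing on disk.
with h5py.File('isinstance_sketch.hdf5', 'w', driver='core', backing_store=False) as f:
    grp1 = f.create_group('group1')
    kind_a, obj_a = resolve(f, grp1.ref)
    kind_b, obj_b = resolve(f, '/group1')
    both_match = (obj_a == grp1) and (obj_b == grp1)
```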
Since the reference is an “absolute” way of locating an object, you can use any group in
the file for dereferencing, not just the root group:
>>> out = grp2[grp1.ref]
>>> out == grp1
True
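As a small sketch of the same point, the script below dereferences one reference through both the root group and a sibling group and compares the results. The filename and variable names are illustrative assumptions, and the file is kept in memory.

```python
import h5py

# Assumption: in-memory file, so the example has no side effects on disk.
with h5py.File('anygroup_sketch.hdf5', 'w', driver='core', backing_store=False) as f:
    grp1 = f.create_group('group1')
    grp2 = f.create_group('group2')

    via_root = f[grp1.ref]        # dereference through the root group
    via_sibling = grp2[grp1.ref]  # dereference through an unrelated group

    # The starting group does not matter: both lookups reach group1.
    root_ok = (via_root == grp1)
    sibling_ok = (via_sibling == grp1)
```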
But keep in mind they're local to the file. Trying to dereference them in the context of
another file will fail:
>>> with h5py.File('anotherfile.hdf5', 'w') as f2:
...     out = f2[grp1.ref]
ValueError: unable dereference object
References as “Unbreakable” Links
So far there seems to be no improvement over using links. But there's an important
difference: you can store them as data, and they're independent of later renaming of the
objects involved.
Here's an example: suppose we wanted to add an attribute on one of our groups pointing
to the dataset mydata. We could simply record the name as an attribute:
>>> grp1.attrs['dataset'] = dset.name
>>> grp1.attrs['dataset']
u'/mydata'
>>> out = f[grp1.attrs['dataset']]
>>> out == dset
True
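To see why the distinction matters, the sketch below stores both the name and a reference as attributes, then renames the dataset. It is an illustration under assumptions rather than a continuation of the session above: the attribute names by_name and by_ref, the new name renamed, and the in-memory file are all made up for the example.

```python
import h5py

# Assumption: in-memory file; attribute and dataset names are illustrative.
with h5py.File('rename_sketch.hdf5', 'w', driver='core', backing_store=False) as f:
    grp1 = f.create_group('group1')
    dset = f.create_dataset('mydata', shape=(100,))

    grp1.attrs['by_name'] = dset.name   # stores the string '/mydata'
    grp1.attrs['by_ref'] = dset.ref     # stores an object reference

    f.move('mydata', 'renamed')         # rename the dataset within the file

    # The stored name now points at a path that no longer exists...
    name_still_valid = grp1.attrs['by_name'] in f
    # ...while the stored reference still resolves to the same dataset.
    ref_target = f[grp1.attrs['by_ref']]
    ref_still_valid = (ref_target == dset)
```

After the rename, the lookup by stored name should fail while the reference continues to lead to the dataset, which is the sense in which references act as “unbreakable” links.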