The visitor pattern is a little different from standard Python iteration, but is quite
powerful once you get used to it. For example, here's a simple way to get a list of every
single object in the file:
>>> mylist = []
>>> f.visit(mylist.append)
As with all object names in the file, the names supplied to visit are “text” strings
(unicode on Python 2, str on Python 3). Keep this in mind when writing your callbacks.
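A callback can be any callable that accepts a single name. As a minimal sketch, assuming h5py is installed and using a throwaway in-memory file (the file name and the object names below are made up for illustration), the following collects the names and checks their type:

```python
import h5py

# Throwaway in-memory file; nothing is written to disk.
# 'demo.hdf5', 'group_a' and 'data' are illustrative names only.
f = h5py.File('demo.hdf5', 'w', driver='core', backing_store=False)
f.create_group('group_a')
f.create_dataset('group_a/data', data=[1, 2, 3])

names = []

def record(name):
    # visit passes each name as a text string, relative to the
    # group on which visit was called.
    names.append(name)

f.visit(record)

print(names)
print(all(isinstance(n, str) for n in names))

f.close()
```

Because the names are relative to the group being visited, a callback that needs the object itself has to look it up from that same group.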
Multiple Links and visit
Of course, we know that an HDF5 file is not just a simple tree. Hard links are a great
way to share objects between groups. But how do they interact with visit?
Let's add a hard link inside the group we just explored (top_group_1), pointing at
top_group_2, and run visit again to see what happens:
>>> grp['hardlink'] = f['top_group_2']
>>> grp.visit(printname)
hardlink
hardlink/sub_dataset_2
subgroup_1
subgroup_1/sub_dataset_1
Not bad. The group at /top_group_2 is effectively “mounted” in the file at
/top_group_1/hardlink, and visit explores it correctly.
Now let's try something a little different. We'll undo that last hard link, and try to
trick visit into visiting sub_dataset_1 twice:
>>> del grp['hardlink']
>>> grp['hardlink_to_dataset'] = grp['subgroup_1/sub_dataset_1']
>>> grp.visit(printname)
hardlink_to_dataset
subgroup_1
What happened? We didn't see sub_dataset_1 in the output this time.
By design, each object in a file will be visited only once, regardless of how many links
exist to the object. Among other things, this eliminates the possibility of getting stuck
in an endless loop, as might happen if some clever person were to try the following:
>>> f['/root'] = f['/']
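The visit-once rule can be pictured with a small pure-Python model. This is only a sketch of the idea, not how HDF5 or h5py implement it: groups are stood in for by plain dicts, a list stands in for a dataset, and the hypothetical helper visit_once tracks which objects it has already reported by their id().

```python
def visit_once(group, callback, prefix='', seen=None):
    """Call callback(name) once for each object reachable from group."""
    if seen is None:
        # The starting group itself is never reported, but it must be
        # remembered so that a link back to it is not followed.
        seen = {id(group)}
    for link_name in sorted(group):
        obj = group[link_name]
        if id(obj) in seen:
            continue  # already reported under another name
        seen.add(id(obj))
        path = prefix + link_name
        callback(path)
        if isinstance(obj, dict):  # a "group": descend into it
            visit_once(obj, callback, path + '/', seen)

# Toy hierarchy (hypothetical stand-in for the group in the text).
dataset = [1, 2, 3]
root = {'subgroup_1': {'sub_dataset_1': dataset}}
root['hardlink_to_dataset'] = dataset  # second link to the same object
root['root'] = root                    # a link that forms a loop

names = []
visit_once(root, names.append)
print(names)  # ['hardlink_to_dataset', 'subgroup_1']
```

In this model the looping link is skipped because its target has already been seen, so the traversal terminates, and the doubly linked dataset is reported only under the first name encountered.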
There is a trade-off. As we saw in our initial discussion of hard links, there's no such
thing as the “original” or “real” name for an object. So if multiple links point to your
dataset, when visit supplies a name it may not be the one you expect.
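One way to cope with this is to compare objects rather than names. The following is a minimal sketch, assuming h5py is installed; the file is a throwaway in-memory one and its name is made up for illustration. It relies on h5py objects comparing equal with == when they refer to the same object in the file.

```python
import h5py

# Throwaway in-memory file; 'links_demo.hdf5' is an illustrative name.
f = h5py.File('links_demo.hdf5', 'w', driver='core', backing_store=False)
grp = f.create_group('top_group_1')
grp.create_dataset('subgroup_1/sub_dataset_1', data=[1, 2, 3])

# Second hard link to the same dataset.
grp['hardlink_to_dataset'] = grp['subgroup_1/sub_dataset_1']

names = []
grp.visit(names.append)

# Whatever name visit reported, equality tells us which entries
# refer to the dataset we care about.
target = grp['subgroup_1/sub_dataset_1']
is_same = grp['hardlink_to_dataset'] == target
same = [name for name in names if grp[name] == target]

print(names)
print(is_same)
print(same)

f.close()
```

Here the list same holds whichever single name visit chose for the dataset, while is_same confirms that both links lead to the same object.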