Figure 9-1 shows schematically how this works. If you have multiple threads running
and one of them calls into HDF5 (for example, to write a large dataset to disk), the others
will not proceed until the call completes.
Figure 9-1. Outline of a threading-based program using HDF5
The h5py package is “thread-safe,” in that you can safely share objects between threads
without corruption, and there's no global state that lets one thread stomp on another.
However, certain high-level operations are not yet guaranteed to be atomic. Therefore,
it's recommended that you manage access to your HDF5 objects by using recursive locks.
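To see why a recursive lock is the suggested tool, here is a minimal standard-library sketch. It does not use h5py: the `store` dictionary is a hypothetical stand-in for a shared HDF5 object, and `append_entry`, `record`, and `worker` are names invented for this illustration. The point is that a multi-step update can hold the lock while calling a helper that takes the same lock.

```python
import threading

lock = threading.RLock()  # re-entrant: the owning thread may acquire it again

# Hypothetical stand-in for a shared HDF5 object (assumption, not h5py).
store = {"count": 0, "log": []}


def append_entry(value):
    """Low-level helper; takes the lock itself so it is safe to call alone."""
    with lock:
        store["log"].append(value)


def record(value):
    """Two-step update that should look atomic to other threads."""
    with lock:
        # Re-acquires the lock already held by this thread.
        # A plain threading.Lock would deadlock at this call.
        append_entry(value)
        store["count"] = len(store["log"])


def worker(n):
    for i in range(n):
        record(i)


threads = [threading.Thread(target=worker, args=(500,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every access goes through the same `RLock`, the count and the log stay consistent, and helpers can be nested freely without worrying about which caller already holds the lock.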
Here's an example: we'll create a single shared HDF5 file and two threads that do some
computation and write to it. Access to the file is managed using an instance of the
threading.RLock class:
import threading
import time
import random

import numpy as np
import h5py

f = h5py.File("thread_demo.hdf5", "w")
dset = f.create_dataset("data", (2, 1024), dtype='f')

lock = threading.RLock()

class ComputeThread(threading.Thread):

    def __init__(self, axis):
        self.axis = axis  # One thread does dset[0,:], the other dset[1,:].
        threading.Thread.__init__(self)

    def run(self):
        """ Perform a series of (simulated) computations and save to dataset.