Type Guessing
When you create a dataset, you generally specify the data type you want by providing a NumPy dtype object. There are exceptions; for example, you can get a single-precision float by omitting the dtype when calling create_dataset. But every dataset has an explicit dtype, and you can always discover what it is via the .dtype property:
>>> dset.dtype
dtype('float32')
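To see both behaviors side by side, here is a minimal sketch; the file name dtypedemo.hdf5 and the dataset names are hypothetical:
>>> import numpy as np
>>> import h5py
>>> f2 = h5py.File('dtypedemo.hdf5', 'w')     # hypothetical scratch file
>>> f2.create_dataset('explicit', (100,), dtype=np.float64).dtype
dtype('float64')
>>> f2.create_dataset('implicit', (100,)).dtype    # dtype omitted
dtype('float32')
>>> f2.close()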
In contrast, h5py generally hides the type of an attribute from you. It's important to remember that each attribute nonetheless has a definite type in the HDF5 file; the dictionary-style interface just means that the type is usually inferred from the value you provide.
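For reference, attributes like the ones inspected below are written with plain dictionary-style assignments. A minimal sketch, assuming the attrsdemo.hdf5 file and dset dataset set up earlier:
>>> dset.attrs['run_id'] = 144       # type inferred: native int
>>> dset.attrs['sample_rate'] = 1e8  # type inferred: native double
>>> dset.attrs['title'] = 'Dataset from third round of experiments'
No dtype appears anywhere; h5py picks one for each value.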
Let's flush our file to disk with:
>>> f.flush()
and look at it with h5ls:
$ h5ls -vlr attrsdemo.hdf5
Opened "attrsdemo.hdf5" with sec2 driver.
/                        Group
    Location:  1:96
    Links:     1
/dataset                 Dataset {100/100}
    Attribute: run_id scalar
        Type:      native int
        Data:  144
    Attribute: sample_rate scalar
        Type:      native double
        Data:  1e+08
    Attribute: title scalar
        Type:      variable-length null-terminated ASCII string
        Data:  "Dataset from third round of experiments"
    Location:  1:800
    Links:     1
    Storage:   400 logical bytes, 0 allocated bytes
    Type:      native float
In most cases, the type is determined by simply passing the value to np.array and then storing the resulting object. For integers on 32-bit systems, you would get a 32-bit ("native") integer:
>>> np.array(144).dtype
dtype('int32')
This explains the “native int” type for run_id .
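The same rule accounts for the other types in the listing. A Python float passed through np.array becomes a 64-bit float, which h5ls reports as "native double":
>>> np.array(1e8).dtype
dtype('float64')
Strings are treated specially: rather than using NumPy's fixed-width string types, h5py stores Python strings as variable-length HDF5 strings, which is why title appears as a variable-length ASCII string.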
You're not limited to scalar values, by the way. There's no problem storing whole NumPy arrays in the file, as the sketch below shows.
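A minimal sketch, assuming the dset object from above; the attribute name 'calibration' is hypothetical:
>>> dset.attrs['calibration'] = np.ones((3,))   # a whole array as an attribute
>>> dset.attrs['calibration']
array([1., 1., 1.])
Reading the attribute back gives you a regular NumPy array.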